Hello,
so i have to summarize some investmentfunds that kinda look like this:
fundid fundnr isin cusip calmt quarter month year
fs00008l08 1 US74316J7303 74316J730 631 210 8 2012
fs00008l08 1 US74316J7303 74316J730 639 213 4 2013
fs00008l08 1 US74316J7303 74316J730 661 220 2 2015
fs00008l08 1 US74316J7303 74316J730 632 210 9 2012
fs00008l0w 2 US00766Y4897 00766Y489 680 226 9 2016
fs00008l0w 2 US00766Y4897 00766Y489 701 233 6 2018
fs00008l0w 2 US00766Y4897 00766Y489 621 207 10 2011
fs00008l0w 2 US00766Y4897 00766Y489 668 222 9 2015 [.....]
mf_max_age
3.753593
3.753593
3.753593
3.753593
6.885695
6.885695
6.885695
6.885695
(simplified)
so following problem:
i am trying to find out the average max age for every fundnr, the following problem is that the fundnr are duplicates because they represent different months/years/returns within the fund but they have the same max age. So if I try 'mean(mf_max_age)' this will simply not work because of the unnecessary duplicates of the fundnr. I know I can delete them, but that would mess up the whole database, if i had only one fund of each fundnr left (i woud lose information on different years within the fundnrs and so on)
How can I avoid this problem? Maybe something like: mean(mf_max_age) for each fundnr only once?
Thank you for any recommendations!
Related Posts with duplicate data
xtivreg2 warning that covariance matrix is not of full rank. Detecting singletons?Dear all, I am running the following code: . xtset id year . xi: xtivreg2 `t' (`z'=instrument_p) …
Generate variable of ratios of consecutive observations, by subgroupHello. I am working with a panel dataset as characterized below. Code: * Example generated by -da…
How to identify parents not co-residing with their children using household dataDear all, I need to create a variable which identifies, for each individual, the following statuses:…
What are the advantage of adding interactive variable over subsampling in drawing conclusions?There are two main advantages of adding interaction variables over subsampling are: (1) having highe…
How to obtain ML (Cox-Snell)R2 after using seemly unrelated regression (sureg)Dear forum, I have a basic doubts of methodology for which many of you may know the answer. I am u…
Subscribe to:
Post Comments (Atom)
0 Response to duplicate data
Post a Comment