Hello,
so i have to summarize some investmentfunds that kinda look like this:
fundid fundnr isin cusip calmt quarter month year
fs00008l08 1 US74316J7303 74316J730 631 210 8 2012
fs00008l08 1 US74316J7303 74316J730 639 213 4 2013
fs00008l08 1 US74316J7303 74316J730 661 220 2 2015
fs00008l08 1 US74316J7303 74316J730 632 210 9 2012
fs00008l0w 2 US00766Y4897 00766Y489 680 226 9 2016
fs00008l0w 2 US00766Y4897 00766Y489 701 233 6 2018
fs00008l0w 2 US00766Y4897 00766Y489 621 207 10 2011
fs00008l0w 2 US00766Y4897 00766Y489 668 222 9 2015 [.....]
mf_max_age
3.753593
3.753593
3.753593
3.753593
6.885695
6.885695
6.885695
6.885695
(simplified)
so following problem:
i am trying to find out the average max age for every fundnr, the following problem is that the fundnr are duplicates because they represent different months/years/returns within the fund but they have the same max age. So if I try 'mean(mf_max_age)' this will simply not work because of the unnecessary duplicates of the fundnr. I know I can delete them, but that would mess up the whole database, if i had only one fund of each fundnr left (i woud lose information on different years within the fundnrs and so on)
How can I avoid this problem? Maybe something like: mean(mf_max_age) for each fundnr only once?
Thank you for any recommendations!
Related Posts with duplicate data
Stata commandHello. I am working with household survey dataset. I need to find the share of diversified farm hous…
kappaetc won't calculate confidence intervalsHi, I have a dataset with 23 raters (ie vars ID1 - ID23) who rated 7 statements (var 'statement') on…
Finding the earliest date within two boundariesI have three date variables corresponding to the beginning of three surveys for different individual…
Propensity score matchingHi, Our outcome variable is birthweight and independent variables are antenatal care, education, age…
Transform multiple response dataset from polytomous mode into indicator modeHello. I would like to ask if there is a Stata command to transform multiple response dataset from p…
Subscribe to:
Post Comments (Atom)
0 Response to duplicate data
Post a Comment