I have a dataset on ownership concetration for various firms with roughly 490000 observations . As i am interested in ownership concentrations for each particular firm, the problem i have with my dataset is that, ownership concentrations are not given collectively for the firm but rather independently for each individual within a firm. Hence, my questions is how i can aggregate the observations for the variables sh_direct1 and sh_total1 to one observations for each firm? Below you can find an example of my data
input str16 idnr double idnr1 str2 sh_type str6(sh_direct sh_total) long sh_date str24 repbas_header long sh_type1 float(sh_type2 firmid) long(sh_direct1 sh_total1) float sh_direct1temp
"DKFO3" 3 "I" ">5.00" "" 21598 "Consolidated data" 8 8 1 8877 . 26631
"DKFO3" 3 "I" ">5.00" "" 21598 "Consolidated data" 8 8 1 8877 . 26631
"DKFO3" 3 "I" ">5.00" "" 21598 "Consolidated data" 8 8 1 8877 . 26631
"RO19" 19 "I" "-" "" 21579 "Unconsolidated data" 8 8 2 1 . 809
"RO19" 19 "I" "16.05" "" 21365 "Unconsolidated data" 8 8 2 808 . 809
"GBSE000027" 27 "I" "10.27" "" 21419 "Unconsolidated data" 8 8 3 229 . 21159
"GBSE000027" 27 "I" "9.82" "" 21419 "Unconsolidated data" 8 8 3 8015 . 21159
"GBSE000027" 27 "I" "9.82" "" 21419 "Unconsolidated data" 8 8 3 8015 . 21159
"GBSE000027" 27 "I" "10.27" "" 21419 "Unconsolidated data" 8 8 3 229 . 21159
"GBSE000027" 27 "I" "19.64" "" 21419 "Unconsolidated data" 8 8 3 1167 . 21159
"GBSE000027" 27 "I" "40.18" "" 21419 "Unconsolidated data" 8 8 3 3504 . 21159
"MTC35" 35 "I" "2.23" "" 20939 "Unconsolidated data" 8 8 4 1226 . 8459
"MTC35" 35 "I" "0.48" "" 20939 "Unconsolidated data" 8 8 4 50 . 8459
"RO35" 35 "I" "23.71" "" 21579 "Unconsolidated data" 8 8 4 1674 . 8459
"MTC35" 35 "I" "6.83" "" 20939 "Unconsolidated data" 8 8 4 5509 . 8459
"GBJE67" 67 "I" "NG" "" 21166 "Consolidated data" 8 8 5 8892 . 44460
"GBJE67" 67 "I" "NG" "" 21166 "Consolidated data" 8 8 5 8892 . 44460
"GBJE67" 67 "I" "NG" "" 21166 "Consolidated data" 8 8 5 8892 . 44460
"GBJE67" 67 "I" "NG" "" 21166 "Consolidated data" 8 8 5 8892 . 44460
"GBJE67" 67 "I" "NG" "" 21166 "Consolidated data" 8 8 5 8892 . 44460
"MTC68" 68 "I" "0.36" "" 21605 "Unconsolidated data" 8 8 6 38 . 76
"MTC68" 68 "I" "0.36" "" 21605 "Unconsolidated data" 8 8 6 38 . 76
"GBSE000075" 75 "I" "19.11" "" 21317 "Consolidated data" 8 8 7 1114 . 15307
"GBSE000075" 75 "I" "20.25" "" 21317 "Consolidated data" 8 8 7 1328 . 15307
"GBSE000075" 75 "I" "NG" "" 21317 "Consolidated data" 8 8 7 8892 . 15307
"GBSE000075" 75 "I" "20.21" "" 21317 "Consolidated data" 8 8 7 1324 . 15307
"GBSE000075" 75 "I" "20.18" "" 21317 "Consolidated data" 8 8 7 1321 . 15307
"GBSE000075" 75 "I" "20.24" "" 21317 "Consolidated data" 8 8 7 1327 . 15307
"GBSE000075" 75 "I" "-" "MO" 21564 "Consolidated data" 8 8 7 1 1128 15307
"GBSE000078" 78 "I" "-" "3.93" 21586 "Consolidated data" 8 8 8 1 604 4782
"GBSE000078" 78 "I" "-" "10.22" 21184 "Consolidated data" 8 8 8 1 175 4782
"GBSE000078" 78 "I" "0.01" "" 21184 "Consolidated data" 8 8 8 3 . 4782
"GBSE000078" 78 "I" "4.82" "" 21264 "Consolidated data" 8 8 8 3468 . 4782
"GBSE000078" 78 "I" "0.02" "" 21184 "Consolidated data" 8 8 8 4 . 4782
"GBSE000078" 78 "I" "-" "1.05" 21457 "Consolidated data" 8 8 8 1 100 4782
"GBSE000078" 78 "I" "0.01" "" 21184 "Consolidated data" 8 8 8 3 . 4782
For example for the firm MTC35 i would like to have just one observation with the total amount of 2.71.
I have already grouped the firms by giving them their individual firm IDs, and used the following 2 codes trying to summarize the observations for each individual observation.
Code:
egen sh_direct1temp = sum(sh_direct1), by(firmid) collapse (sum) sh_direct1, by (firmid)
Thus to re-emphasise i would like have for each individual firm one observation with the total sh_direct1 and sh_total1 (even though in this case this firm doesn't have a percentage for sh_total1) instead of three like is the case with DKF03 or firmid 1 in this case.
Any help would be greatly appreciated, Thank you!
Regards,
Shariff
0 Response to Aggregating observations for different variables based on an ID
Post a Comment