Hello dear Statalist,

I am quite new to Stata and could really use some help with my thesis.

I have variables called "cusip" (firm code), "mgrno" (unique numerical code for each investor), "rdate" (report date), "typecode" (classification from 1 to 5 based on investor type), "shares" (number of shares held by each investor in a firm), and "shrout2" (total shares outstanding for each firm, in thousands).

However, some investors report their holdings several times a year, so if I simply sum the shares by typecode for each firm, many investors are counted more than once because they report quarterly. Not all of them do, though, so how should I handle this? For example, I would like to get the total ownership of each investor type in every firm as of the latest date on which the ownership was reported.

In the end, I would like to have a dataset with the ownership percentage (shares held by each investor type in a firm divided by the firm's total shares) for each of the 5 investor types, for each firm, for that year.
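
To make the goal concrete, here is a rough sketch of what I think I am after, using made-up toy numbers rather than my real data. I am not sure this is the right approach. It assumes that rdate is a Stata daily date, that shrout2 is constant within a firm and year, and that only each investor's latest report per firm and year should count. The variables rdate_s, year and own_pct are names I invented for the example.

```stata
* Toy data (hypothetical values) with the variables described above
clear
input str8 cusip long mgrno str10 rdate_s byte typecode double shares double shrout2
"A" 1 "2010-03-31" 1 1000 10
"A" 1 "2010-12-31" 1 2000 10
"A" 2 "2010-06-30" 1  500 10
"A" 3 "2010-12-31" 2 1000 10
end

* Assumption: rdate is a daily date; here it is built from a string
gen rdate = date(rdate_s, "YMD")
format rdate %td
gen year = year(rdate)

* Keep only the latest report of each investor in each firm and year
bysort cusip mgrno year (rdate): keep if _n == _N

* Sum shares by firm, year and investor type
* Assumption: shrout2 does not vary within a firm and year
collapse (sum) shares (first) shrout2, by(cusip year typecode)

* shrout2 is in thousands, so scale it before computing the percentage
gen own_pct = 100 * shares / (shrout2 * 1000)

list cusip year typecode shares own_pct, sepby(cusip)
```

In this toy example investor 1 reports twice in 2010, and only the December holding is kept, so the double counting I described should not occur.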

Could someone help me?

When I tried dropping rdate duplicates based on mgrno and cusip, I lost necessary observations.

