Dear all,

I am running into an issue that I can use your expertise on. I will write out in this post exactly what I did, and what I want. I hope that is of help.

I am working with a database that contains board information of firms. Meaning that for every firm and year, I have multiple observations, since there are multiple board members. I have done all my calculations, so I do not need the individual rows of directors but I want to aggregate everything to one observation per firm and year. Here is a piece of my database:

Code:
input str12 ISIN int Year double DirectorID float Dummy SumOfDummy
"DK0060014751" 2019 2289205 0 0
"FI4000306733" 2019 2290700 0 0
"MHY1146L1092" 2019 2159476 1 3
"MHY1146L1092" 2019 2372111 1 3
"MHY1146L1092" 2014 2372111 0 0
"MHY1146L1092" 2014 2372111 0 0
"MHY1146L1092" 2019 2372111 1 3
"DE0007765604" 2002   28801 0
What I want is to aggregate the variable Dummy to a firm level. However, I want to count the unique occurrences of the dummy. So, in the case of MHY1146L1092 in the year 2019, I would get a SumOfDummy of 2, since there are two unique DirectorIDs found with a dummy 1. However, In my data above, I have a SumOfDummy of 3, because it does not take the DirectorID into account. To calculate SumOfDummy, I used the following code:

Code:
bysort ISIN Year: egen SumOfDummy = sum(Dummy)
As previously stated, I do not need the DirectorID anymore. So, ultimately, I would want MHY1146L1092 for the year 2019 is only displayed once.

How could I go about this? Please let me know if it is unclear. Then I can clarify it as much as possible.

Thank you for your time and efforts.