I have learned a lot on the Statalist forum, but I am now running into something that I could use your expertise on. I think I could manage what I want in Excel, but since I have many observations and want to learn how this works in Stata, I am asking here.
I am doing a study, one part of which concerns the structure of boards. For this, I am using BoardEx. From BoardEx, I have exported three files, which together contain the information I need; I describe them below. My main goal is to calculate the ratio of female directors in a given firm-year, that is, the number of female directors in a firm-year divided by the total board size.
In the first file, the genders of the board members are stated per DirectorID. Here is a short preview.
Code:
input str21 Title str30 Forename1 str11 DOB str3 Age str1 Gender str36 Nationality long DirectorID int NetworkSize
"Admiral" "Anna" "Jan 1972" "47" "F" "" 1530600 68
"Director" "Aziz" "1966" "54" "M" "" 1356558 245
"Admiral" "Bobby" "04 Apr 1931" "89" "M" "American" 33004 4323
"Chairman" "John" "10 Jul 1953" "66" "M" "French" 1381271 120
end
Code:
input str64 DirectorName str128 CompanyName str85 Qualification long(DirectorID CompanyID) int AwardDate
"Doctor Christopher Albrecht" "Universitat Basel (University of Basel)" "Graduated" 39 62183 .
"Doctor Christopher Albrecht" "Universitat Basel (University of Basel)" "PhD" 39 62183 1461
"John Loudon" "Yale University" "BA" 52 62981 -1095
"John Loudon" "Université Paris Sorbonne - Paris IV (Paris Sorbonne University - Paris IV)" "Graduated" 52 63794 .
end
format %tdnn/dd/CCYY AwardDate
Code:
input long(BoardID DirectorID) str4 Year str12 ISIN byte NumberDirectors
1990357 31 "2016" "JE00BD3QJR55" 3
1990357 31 "2015" "JE00BD3QJR55" 3
1990357 31 "2014" "JE00BD3QJR55" 3
1990357 31 "2017" "JE00BD3QJR55" 3
1990357 31 "2019" "JE00BD3QJR55" 4
1990357 31 "2018" "JE00BD3QJR55" 3
17834 36 "2002" "IE0004906560" 18
17834 36 "2006" "IE0004906560" 19
17834 36 "2004" "IE0004906560" 19
17834 36 "2011" "IE0004906560" 15
17834 36 "2005" "IE0004906560" 19
17834 36 "2010" "IE0004906560" 15
17834 36 "2007" "IE0004906560" 18
17834 36 "2003" "IE0004906560" 19
17834 36 "2008" "IE0004906560" 15
17834 36 "2009" "IE0004906560" 15
32910 37 "2002" "DE0007664005" 28
32910 37 "2002" "DE0007664039" 28
3447 39 "2002" "CH0012410517" 12
20144 42 "2003" "IT0000062957" 23
8678 42 "2003" "FR0000120644" 13
15520 42 "2002" "IT0001353173" 15
end
Code:
variable DirectorID does not uniquely identify observations in the master data
So, my question is: how can I realize this? And how can I then calculate the female ratio (number of females/number of directors)?
Thank you for your time and effort. If something is not clear, please let me know. If I can improve my future posts in any way, I would be glad to hear that as well.
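For reference, here is a minimal sketch of one way the merge and the ratio might be approached. It is not a confirmed solution. The error above is the one Stata reports when a `merge 1:1` (or `1:m`) is attempted and the key repeats in the master data, which the third file does by construction (one row per director per year). The sketch assumes that DirectorID is unique in the gender file, that a firm-year is identified by BoardID and Year, and that the board file lists every director of each board. The data below are made-up toy values, and the variable names `onerow`, `female`, `n_female`, and `female_ratio` are hypothetical.

```stata
* Toy director-level data (made-up IDs; stands in for the first file).
clear
input long DirectorID str1 Gender
1 "F"
2 "M"
3 "F"
end
isid DirectorID              // stops with an error if DirectorID repeats
tempfile gender
save `gender'

* Toy board-level data (made-up; stands in for the third file).
* Board 100 appears under two ISINs, as board 32910 does in the preview.
clear
input long(BoardID DirectorID) str4 Year str12 ISIN byte NumberDirectors
100 1 "2002" "XX0000000001" 3
100 2 "2002" "XX0000000001" 3
100 3 "2002" "XX0000000001" 3
100 1 "2002" "XX0000000002" 3
100 2 "2002" "XX0000000002" 3
100 3 "2002" "XX0000000002" 3
200 2 "2002" "XX0000000003" 2
200 3 "2002" "XX0000000003" 2
end

* Many board rows point to one director row, hence m:1 rather than 1:1.
merge m:1 DirectorID using `gender', keep(master match)

* Flag one row per director within a firm-year so that a director
* listed under several ISINs is counted only once.
egen byte onerow = tag(BoardID Year DirectorID)

* 1 if female, 0 if male, missing if the director was not matched.
generate byte female = (Gender == "F") if _merge == 3

* Number of distinct female directors per firm-year.
bysort BoardID Year: egen n_female = total(female * onerow)

* Female ratio, using the board size reported by BoardEx.
generate female_ratio = n_female / NumberDirectors
```

With real data, the two `input` blocks would be replaced by `use` on the exported files. If `isid` fails on the gender file, the duplicates would need to be inspected (for example with `duplicates report DirectorID`) before merging.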