I’m using STATA 15 to generate a variable that will be used as an instrument in a regression.
The dataset that I use is on director level, meaning that for each firm in a year the directors of the board are unique observations. (note that the directors in the sample don’t need to add up to the board size, since executives are excluded, for example)
Below is a subsection of the dataset.
cusip_9 | fyear | directorid | gender | board_size | fraction_female_directos |
63904106 | 2000 | 323732 | M | 9 | 0.111 |
63904106 | 2001 | 323712 | M | 10 | 0.1 |
63904106 | 2001 | 323719 | M | 10 | 0.1 |
63904106 | 2001 | 323721 | M | 10 | 0.1 |
63904106 | 2001 | 323722 | F | 10 | 0.1 |
63904106 | 2001 | 323728 | M | 10 | 0.1 |
63904106 | 2001 | 324051 | M | 10 | 0.1 |
63904106 | 2001 | 324056 | M | 10 | 0.1 |
63904106 | 2002 | 323713 | M | 11 | 0.182 |
63904106 | 2002 | 323722 | F | 11 | 0.182 |
63904106 | 2002 | 323726 | M | 11 | 0.182 |
63904106 | 2002 | 323728 | M | 11 | 0.182 |
63904106 | 2002 | 323734 | F | 11 | 0.182 |
63904106 | 2002 | 324056 | M | 11 | 0.182 |
63904106 | 2003 | 323716 | M | 11 | 0.182 |
63904106 | 2003 | 323719 | M | 11 | 0.182 |
63904106 | 2003 | 323721 | M | 11 | 0.182 |
So, in the data a director can sit on multiple boards(firms) in a year. To break it down, I want to know of the directors that sit on other boards if there are females on those other boards. Secondly making the fraction of male directors in a board that sit on other boards in a year that have a female on the board. It is important that this fraction doesn’t incorporate females on the board in question for which I want to calculate this connection fraction.
The code that I used so far is the following, and for a fact wrong and inefficient
Code:
//generating variable to indicate on how many other boards a directors sits in the sample. sort directorid fyear quietly by directorid fyear: gen other_boards=cond(_N==1,0,_n) label variable other_boards "# of boards a directors sits on in the sample" //generating other_boards only for men gen other_boards_m = 0 replace other_boards_m = 1 if (gender=="M" & other_boards>1) //generating instrument as the fraction of male directors on the board who sit on other boards on which there are female directors //frist generating connection and later the fraction gen connection_female = 0 replace connection_female = 1 if (firm_has_female_directors==1 & other_boards_m>=1) //fraction of male directors on the board who sit on other boards on which there are female directors gen fraction_connection_female = (connection_female/board_size) label variable fraction_connection_female "fraction of male directors on the board who sit on other boards on which there are female directors"
Someone mentioned that the egen command could be far easier to compute what I'm after. Unfortunately, I have no idea how, since my stata knowledge is not that great.
Hopefully someone can help me.
0 Response to Problem with generating an instrumental variable
Post a Comment