I have just started my PhD, am quite new to Stata, and could use some guidance on choosing the correct model for my data.
Generally speaking, I am analyzing whether top management team characteristics have an effect on a certain external event. Both the dependent variable and the key independent variable are binary, whereas most controls are continuous. The data contain around 23,000 firm-year observations over a 15-year period. Each firm-year observation includes data on CEO and CFO characteristics (e.g. gender, age, salary) and firm fundamentals. The data look something like this:
firmid | ceoid | year | cfoid | dependent variable (y) | independent variable_ceo (x_ceo) | independent variable_cfo (x_cfo) | control1 |
1 | 1 | 2002 | 1 | 1 | 1 | 1 | 2151 |
2 | 2 | 2002 | 2 | 1 | 0 | 1 | 2341 |
3 | 3 | 2002 | 3 | 0 | 1 | 1 | 212 |
1 | 1 | 2003 | 4 | 1 | 1 | 0 | 131 |
2 | 2 | 2003 | 2 | 0 | 1 | 0 | 14245 |
Code:
* the L. lag operators require the panel to be declared first
xtset firmid year
meprobit y L.x_ceo L.x_cfo L.controls || firmid:, vce(robust) intpoints(12)
- Which model would you recommend for the data at hand? Should I use a logit with time and firm fixed effects instead of the mixed-effects probit model? Or should I fall back on an OLS regression?
- Do I need to account in some way for the fact that both the main independent variable and the dependent variable are binary?
- Should I cluster standard errors at the management level rather than at the firm level, given that firm-year observations are not independent within each CEO/CFO combination? If so, how would I include standard errors clustered at the management level if my panel is declared with firmid and year as the panel and time variables?
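To make the last question more concrete, this is roughly what I have in mind. It is an untested sketch: teamid is a variable name I made up for the CEO/CFO pairing, control1 stands in for the full set of controls, and the year dummies are my own addition. I use pooled probit and regress here because, as far as I understand, meprobit expects the firmid groups to be nested within the cluster variable, which is not the case for CEO/CFO pairings that change within a firm.

Code:
* declare the panel so that the L. lag operators work
xtset firmid year

* hypothetical identifier: one value per CEO/CFO pairing
egen teamid = group(ceoid cfoid)

* pooled probit with year dummies, standard errors clustered on the pairing
probit y L.x_ceo L.x_cfo L.control1 i.year, vce(cluster teamid)

* average marginal effects of the two binary regressors
margins, dydx(L.x_ceo L.x_cfo)

* linear probability model (OLS) with the same clustering, for comparison
regress y L.x_ceo L.x_cfo L.control1 i.year, vce(cluster teamid)

Neither of these includes firm effects, so they are only meant to show the clustering syntax, not a model I am proposing.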
Thank you in advance for your help, and please let me know if I need to specify anything further in my post (which may well be the case, given that I am new to this community)!