I'm trying to run a fixed-effects (FE) model on my panel data set. The panel consists of 116 companies observed over 5 years. Within these companies I measure characteristics of different managers (e.g. the CEO) as independent variables, together with a set of control variables, to see their effect on leverage (the dependent variable). The 116 companies report different numbers of managers, which is why I want to run separate regressions for each manager.
Furthermore, the variables are coded as follows:
Dependent:
Leverage: average leverage for each year
Independent:
Gender: 0 Male, 1 Female (dummy)
Age: age at the measurement point
Education: 1-4 depending on educational level
Experience: 0 if no experience, 1 if experience (dummy)
Tenure: tenure at the measurement point
Misdem: 0 if none, 1 if any (dummy)
Control variables:
Industry: 0-9 depending on type of industry (dummy). Unfortunately, the output presented below is from my old tests, which show industry as 1-10.
ROA: measured as %
Firm size: Ln(sales)
As you may notice, many of the variables are time-invariant across my 5-year period (at least within my panel). For example, the Education variable does not vary: if the manager has educational level "2", this stays the same throughout the panel. All the dummy variables are also time-invariant for each manager.
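One way to check which regressors really are time-invariant is to look at their within-company variation. The following is only a sketch, assuming the panel has already been declared with xtset and that the variable names match those in the xtreg call further down:

```stata
* Assumes the panel is already declared, e.g. xtset companynum Years
* xtsum splits each variable's variation into overall, between-company
* and within-company components
xtsum CEOGen CEOAge CEOEdu CEOExp CEOTen CEOMis Industry ROA FirmSize

* A "within" standard deviation of 0 means the variable never changes
* inside a company over the observed years
```

Variables with no within variation carry no information that a within (fe) estimator can use.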
Reading up on how panel data regressions should be run, I found that they are most commonly estimated either by pooled OLS or by a panel regression using a fixed effects (fe) or random effects (re) model. As I understand it, panel regressions are generally considered superior to pooled OLS, which leaves me with the choice between fe and re. From what I understand further, that choice depends on a number of factors but is mainly made with a Hausman test: both regressions, fe and re, are run and their coefficients compared to determine which fits one's data better. A result significant at the 5% level means that fe should be preferred over re(?).
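The Hausman procedure described above can be sketched roughly as follows. The regressor list is illustrative only, the stored-estimate names fe_model and re_model are arbitrary, and default standard errors are used because hausman is not supported after vce(robust):

```stata
* Sketch only: illustrative regressors, default (non-robust) standard errors
xtreg Leverage CEOAge CEOTen ROA FirmSize, fe
estimates store fe_model

xtreg Leverage CEOAge CEOTen ROA FirmSize, re
estimates store re_model

* H0: the difference in coefficients is not systematic (re is consistent)
* Rejecting H0, e.g. p < 0.05, points towards fe
hausman fe_model re_model
```

The consistent estimator (fe) is listed first and the efficient one (re) second.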
Anyway, to get to the problem..
I set up my panel using:
Code:
egen companynum = group(Company)
xtset companynum
xtset companynum Years, yearly
Code:
xtreg dep indep1 indep2 indep3 indep4 indep5 indep6 cont1 cont2 cont3 cont4 cont5, fe vce(robust)
Code:
xtreg Leverage i.CEOGen CEOAge i.CEOEdu i.CEOExp CEOTen CEOMis ROA i.Industry FirmSize, fe vce(robust)
I get the following result:
[Stata regression output; image not preserved]
From what I understand, this problem (omitted variables) arises because fe already absorbs everything that is time-invariant within a company, which is why time-invariant variables get knocked out/omitted in regressions using fe.
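The omission can be seen from the within transformation that fe applies. As a sketch in notation that is assumed here and not taken from the output: let $y_{it}$ be leverage, $x_{it}$ the time-varying regressors, $z_i$ the time-invariant ones, and $\alpha_i$ the company effect.

```latex
% Model with time-varying (x) and time-invariant (z) regressors
y_{it} = x_{it}'\beta + z_i'\gamma + \alpha_i + \varepsilon_{it}

% Company-level means over the observed years
\bar{y}_i = \bar{x}_i'\beta + z_i'\gamma + \alpha_i + \bar{\varepsilon}_i

% Subtracting the means removes alpha_i, and z_i along with it
y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i)
```

Because $z_i$ cancels in the last line, $\gamma$ cannot be estimated by fe, and Stata reports those variables as omitted.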
Which presents my questions:
1. Is it possible to fix this so that I can run an fe regression followed by a Hausman test? Or should I choose the re model or pooled OLS anyway, even if the Hausman test would probably say that fe is preferred, because this data set cannot be run as an fe regression?
2. Is my regression correct in terms of using "i." on Education (CEOEdu variable) and having it coded as 1-4?
3. Is my regression and reasoning correct in general, or have I missed any steps in preparing for this type of regression?
4. Not related to the regression or the problem: is there a simple way to get the regression output from Stata into some presentation or other document format?
Important to add is that my knowledge of statistics (and Stata) is very limited, and so is the time I have to acquire more, hence I have turned to this forum of great expertise for help.
I would very much appreciate fast help with how I should specify my regression to get the correct output for reporting and analyzing the results. An explanation of why one regression should be used instead of another for my data set, if that is the case, would also be highly appreciated.
Thank you in advance and best regards,