I am seriously confused while working on my model, because I am not sure whether the steps I take are correct and in the proper order. I would therefore like to ask for your help, hoping it will clear up my doubts and prevent methodological mistakes.
My model aims to explain the variability of effective tax rates (ETR) using firm characteristics (company-specific data taken from financial statements), controlling for year and industry, as is common practice in the literature. There are 450 companies and 2975 observations of ETR (1608 observations under an alternative formulation of ETR) over the period 2004-2016; each company has at least 3 observations, and the panel is strongly unbalanced.
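To make the specification concrete, it can be sketched in equation form as below. The notation is my own shorthand, introduced only for illustration: i indexes companies, t indexes years and j(i) is the industry of company i; lambda_t and gamma_{j(i)} stand for the year and industry effects, alpha_i for the company-specific effect (treated as fixed or random depending on the estimator) and epsilon_{it} for the idiosyncratic error.

```latex
\[
\begin{aligned}
\mathrm{TotalETR}_{it} ={}& \beta_0 + \beta_1\,\mathrm{SIZE}_{it} + \beta_2\,\mathrm{LEVERAGE}_{it} + \beta_3\,\mathrm{ROA}_{it} \\
&+ \beta_4\,\mathrm{INTNG}_{it} + \beta_5\,\mathrm{CAPINT}_{it} + \beta_6\,\mathrm{INVINT}_{it} \\
&+ \lambda_t + \gamma_{j(i)} + \alpha_i + \varepsilon_{it}
\end{aligned}
\]
```

If INDUSTRY does not change over time within a company, the industry effect is absorbed by alpha_i under FE, so the industry dummies can only be estimated in the RE version.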
First, I want to decide whether an FE or an RE model should be used. I estimate both, test the joint significance of the dummies, and then (after dropping the industry dummies) run a Hausman test. It comes out in favour of FE, at least at the 10% significance level, which is the same level I applied when deciding to keep the time dummies in.
xtset Company YEAR, yearly
xtreg TotalETR SIZE LEVERAGE ROA INTNG CAPINT INVINT i.YEAR, fe
testparm i(2005/2016).YEAR
    F( 12, 2572) = 1.56
    Prob > F = 0.0959
estimates store fixed
xtreg TotalETR SIZE LEVERAGE ROA INTNG CAPINT INVINT i.YEAR i.INDUSTRY, re
testparm i(2/8).INDUSTRY
    chi2( 7) = 4.85
    Prob > chi2 = 0.6779
xtreg TotalETR SIZE LEVERAGE ROA INTNG CAPINT INVINT i.YEAR, re
testparm i(2005/2016).YEAR
    chi2( 12) = 18.90
    Prob > chi2 = 0.0909
hausman fixed ., sigmamore
    Test: Ho: difference in coefficients not systematic
    chi2(18) = (b-B)'[(V_b-V_B)^(-1)](b-B) = 26.63
    Prob>chi2 = 0.0862
xttest3
    Modified Wald test for groupwise heteroskedasticity
    in fixed effect regression model
    H0: sigma(i)^2 = sigma^2 for all i
    chi2 (385) = 2.2e+06
    Prob>chi2 = 0.0000
xtserial TotalETR SIZE ROA LEVERAGE CAPINT INVINT INTNG
    Wooldridge test for autocorrelation in panel data
    H0: no first-order autocorrelation
    F( 1, 340) = 13.334
    Prob > F = 0.0003
xtreg TotalETR SIZE LEVERAGE ROA INTNG CAPINT INVINT Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 Y10 Y11 Y12 Y13, vce(cluster Company) re
xtoverid
    Test of overidentifying restrictions: fixed vs random effects
    Cross-section time-series model: xtreg re robust cluster(Company)
    Sargan-Hansen statistic 22.983 Chi-sq(18) P-value = 0.1912
