I am analyzing a balanced panel of around 2400 firms over 12 years (Stata 13). The output I am able to present here is based on test data, as I am not allowed (or able) to extract the original files. The only differences are that the number of firms is higher in the original dataset and that most of my explanatory variables turn out to be significant there, unlike in this sample data. The F-statistic in the original is F(11,13432) with Prob > F = 0.0000, and the overall R-sq. is 0.9639.
My goal is to analyze the effect of investments in computers (investict) and of product and process innovations on the demand for high-skilled workers. Controls include the size of the firm in terms of employees (total), the industry, a dummy for West Germany (west), a dummy for a collective bargaining agreement (collective), the state of the art of the production equipment (tech), whether the firm engages in R&D (rnd), and some more.
I have used xtserial and xttest3, which have led me to include clustered robust standard errors. The result of xtoverid made me decide to use fixed effects, and -testparm- has made me include year fixed effects. So my regression is now:
Code:
xtreg highskill investict product_inno process_inno total west industry collective exportshare investment turnover rnd tech i.year, fe vce(cluster idnum)
note: west omitted because of collinearity
Fixed-effects (within) regression               Number of obs      =      4344
Group variable: idnum                           Number of groups   =       498
R-sq:  within  = 0.1005                         Obs per group: min =         1
       between = 0.5034                                        avg =       8.7
       overall = 0.4393                                        max =        11
                                                F(21,497)          =      2.60
corr(u_i, Xb)  = 0.3892                         Prob > F           =    0.0001
                                (Std. Err. adjusted for 498 clusters in idnum)
------------------------------------------------------------------------------
             |               Robust
   highskill |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   investict |   .7032893   .2711382     2.59   0.010      .170571    1.236008
product_inno |   .2723859   .6988765     0.39   0.697    -1.100731    1.645503
process_inno |  -.3938082   .4501978    -0.87   0.382    -1.278334    .4907173
       total |    .101938   .0245108     4.16   0.000     .0537805    .1500954
        west |          0  (omitted)
    industry |   .1624997   .1911486     0.85   0.396    -.2130592    .5380586
  collective |  -.2838042   .5861356    -0.48   0.628    -1.435413    .8678049
 exportshare |   .8483747   2.351452     0.36   0.718    -3.771638    5.468387
  investment |   1.44e-06   5.98e-07     2.41   0.016     2.68e-07    2.62e-06
    turnover |  -1.99e-07   1.39e-07    -1.43   0.153    -4.73e-07    7.46e-08
         rnd |  -1.103514   .9824249    -1.12   0.262    -3.033732    .8267042
        tech |  -.6756037   .2828397    -2.39   0.017    -1.231313   -.1198947
             |
        year |
       2008  |   .0310991   .3815399     0.08   0.935    -.7185309    .7807291
       2009  |   .4981931   .3197414     1.56   0.120    -.1300184    1.126405
       2010  |   .7890588   .4913133     1.61   0.109    -.1762483    1.754366
       2011  |   1.109093   .5630923     1.97   0.049     .0027585    2.215428
       2012  |   1.189345   .5407669     2.20   0.028      .126874    2.251816
       2013  |   .0965383   .7094676     0.14   0.892    -1.297387    1.490464
       2014  |   .4120097   .6609871     0.62   0.533    -.8866637    1.710683
       2015  |  -.1867301   .7267681    -0.26   0.797    -1.614647    1.241187
       2016  |   .1137137   .5447759     0.21   0.835     -.956634    1.184061
       2017  |  -.4267298   .7349041    -0.58   0.562    -1.870632    1.017172
             |
       _cons |   4.706464   2.350515     2.00   0.046     .0882924    9.324636
-------------+----------------------------------------------------------------
     sigma_u |  22.632204
     sigma_e |  7.5596268
         rho |  .89962854   (fraction of variance due to u_i)
------------------------------------------------------------------------------
I originally intended to use the share of high-skilled employees as my dependent variable, but after reading the paper by Kronmal (1993) and several posts in this forum concerning the problems with ratios, I have switched to using the absolute number of high-skilled employees (highskill) and including the total number of employees as a control. This has increased my R-squared by a lot (it was only 0.016 before).
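To make the two specifications concrete, here is a minimal sketch of what I mean. The variable name share_highskill is hypothetical (it is not in my dataset under that name), and the regressor list is shortened to the main variables for illustration only.

```stata
* Sketch only: share_highskill is a hypothetical name, and the regressor
* list is shortened for illustration.
generate share_highskill = highskill / total

* Original idea: the share of high-skilled employees as the dependent variable
xtreg share_highskill investict product_inno process_inno i.year, fe vce(cluster idnum)

* Current approach: the head count as the dependent variable, with total as a control
xtreg highskill investict product_inno process_inno total i.year, fe vce(cluster idnum)
```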
On the other hand, I tested my model specification using:
Code:
predict fitted, xb
g sq_fitted=fitted^2
xtreg highskill fitted sq_fitted
test sq_fitted
Also, I don't understand why the dummy for west is omitted, since none of the regressors are highly correlated.
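One check that seems relevant here, since the fixed-effects estimator only uses variation within each firm, is whether west ever changes within a firm. The following is a sketch, assuming the data are already xtset with idnum and year; west_varies is a hypothetical helper variable.

```stata
* Sketch: does west vary within firms? (assumes -xtset idnum year- has been run)
xtsum west          // a within std. dev. of 0 means west never changes within a firm

* Sort west within each firm so that west[1] is the smallest and west[_N] the
* largest value; missing values sort last and would count as a change.
bysort idnum (west): generate byte west_varies = west[1] != west[_N]
tabulate west_varies
```

If west_varies is 0 for every observation, the dummy is constant within each firm and therefore perfectly collinear with the firm fixed effects, whatever its pairwise correlation with the other regressors.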
I have read many posts in this forum and run several tests that made me end up with this fixed effects regression model, so I am confused about the result of the specification test. I have also tried -areg-, absorb(idnum) vce(cluster idnum), which has slightly different coefficients and a higher R-Sq. (as is normal) than the -xtreg, fe- but it has the same result in the misspecification test.
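For reference, the -areg- run mentioned above looked roughly like this. It is a sketch that assumes the same regressor list as the -xtreg, fe- model; the /// line continuation only works when the command is run from a do-file.

```stata
* Sketch: same regressors as the -xtreg, fe- model above (assumed)
areg highskill investict product_inno process_inno total west industry ///
    collective exportshare investment turnover rnd tech i.year,        ///
    absorb(idnum) vce(cluster idnum)
```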
I also tested for normality by running -xtsktest- after a random-effects regression:
Code:
xtreg highskill investict product_inno process_inno total west industry collective exportshare investment turnover rnd tech, re vce(cluster idnum)
Code:
xtsktest
(running _xtsktest_calculations on estimation sample)
Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
Tests for skewness and kurtosis                 Number of obs      =      4344
                                                Replications       =        50
                                (Replications based on 498 clusters in idnum)
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  Skewness_e |  -1805.438   1230.613    -1.47   0.142    -4217.396    606.5195
  Kurtosis_e |   456552.4   194447.7     2.35   0.019     75441.97    837662.8
  Skewness_u |    12182.3   2960.393     4.12   0.000     6380.038    17984.56
  Kurtosis_u |    1510700   274557.2     5.50   0.000     972577.4     2048822
------------------------------------------------------------------------------
Joint test for Normality on e:        chi2(2) =   7.67        Prob > chi2 = 0.0217
Joint test for Normality on u:        chi2(2) =  47.21        Prob > chi2 = 0.0000
------------------------------------------------------------------------------
I appreciate any input on my issues. Thanks in advance,
Helen