I have an unbalanced panel of companies in one country, together with regional/industrial variables: population, the industry's employment share in the region, the number of industries (diversity), competition, and two control variables (one at the company level, one at the regional level).
The variables of interest are the first two.
I'm trying to regress company productivity (measured as the company's total factor productivity) on these regional/industrial variables.
I prepared for the regression like this:
xtset id year, yearly
       panel variable:  id (unbalanced)
        time variable:  year, 2011 to 2016, but with gaps
                delta:  1 year
The basic results for the RE model, estimated by GLS in Stata 15.1, are as follows (I removed the results for the year dummies for the sake of space):
xtreg lnProductivity lnPopulation IndustryShare Diversity Competition Control1 Control2 i.year, re vce(cluster Region_Industry)

Random-effects GLS regression                   Number of obs     =     82,557
Group variable: id                              Number of groups  =     28,722

R-sq:                                           Obs per group:
     within  = 0.0507                                         min =          1
     between = 0.0404                                         avg =        2.9
     overall = 0.0449                                         max =          6

                                                Wald chi2(11)     =     694.47
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                       (Std. Err. adjusted for 978 clusters in Region_Industry)
-------------------------------------------------------------------------------
              |               Robust
lnProductiv~y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
 lnPopulation |   .1219238    .027064     4.51   0.000     .0688793    .1749683
IndustryShare |    .060369   .0277869     2.17   0.030     .0059078    .1148302
    Diversity |   .0668951   .0376893     1.77   0.076    -.0069745    .1407647
  Competition |  -.0721706   .0303292    -2.38   0.017    -.1316148   -.0127265
     Control1 |   .0002489   .0001706     1.46   0.145    -.0000855    .0005833
     Control2 |  -.0003625   .0003653    -0.99   0.321    -.0010785    .0003536
When I add industry dummies (i.Industry) to the same model, I get:

xtreg lnProductivity lnPopulation IndustryShare Diversity Competition Control1 Control2 i.Industry i.year, re vce(cluster Region_Industry)

Random-effects GLS regression                   Number of obs     =     82,557
Group variable: id                              Number of groups  =     28,722

R-sq:                                           Obs per group:
     within  = 0.0515                                         min =          1
     between = 0.5351                                         avg =        2.9
     overall = 0.5430                                         max =          6

                                                Wald chi2(69)     =   15884.90
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                       (Std. Err. adjusted for 978 clusters in Region_Industry)
-------------------------------------------------------------------------------
              |               Robust
lnProductiv~y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
 lnPopulation |   .1224042   .0080391    15.23   0.000     .1066478    .1381606
IndustryShare |   .0442533   .0140208     3.16   0.002     .0167729    .0717336
    Diversity |   .0323125   .0091825     3.52   0.000     .0143152    .0503098
  Competition |  -.0450796   .0103554    -4.35   0.000    -.0653757   -.0247835
     Control1 |   .0003052   .0001753     1.74   0.082    -.0000384    .0006488
     Control2 |  -.0007511   .0005478    -1.37   0.170    -.0018248    .0003225
              |
     Industry |
         102  |   .1498892   .0985636     1.52   0.128    -.0432918    .3430703
         103  |   .1355773   .1181954     1.15   0.251    -.0960815     .367236
         104  |   .3746264   .1751559     2.14   0.032     .0313272    .7179256
         105  |   .4152645   .2612369     1.59   0.112    -.0967504    .9272794
         106  |   .1885796   .1911881     0.99   0.324    -.1861422    .5633014
         107  |   .0894018   .1089424     0.82   0.412    -.1241214    .3029251
         108  |   .5530862   .1164927     4.75   0.000     .3247646    .7814078
         110  |   -1.04191   .1067401    -9.76   0.000    -1.251117   -.8327036
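As you can see, the overall R-squared rises from 0.0449 to 0.5430 once the industry dummies are included, and the between R-squared rises from 0.0404 to 0.5351, while the within R-squared barely changes (0.0507 vs. 0.0515).
So I'm quite confused about the reason behind this jump, and I'm wondering whether I should include industry fixed effects in the RE model.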
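One way to check where the jump comes from is sketched below, using the variable names from the regressions above. It is only a rough diagnostic, and it assumes that Industry is constant over time within each company, which the nearly unchanged within R-squared suggests but which I have not verified. The `///` line continuation assumes the commands are run from a do-file.

```stata
* Sketch only: variable names follow the regressions above.
* Assumption: Industry does not change over time within a company.

* 1. Split the variation in productivity into between-company and
*    within-company components.
xtsum lnProductivity

* 2. See how much of the pooled variation the industry dummies explain
*    on their own.
regress lnProductivity i.Industry, vce(cluster Region_Industry)
display "R-squared from industry dummies alone: " e(r2)

* 3. Re-run the RE model with industry dummies and test whether they
*    are jointly significant.
xtreg lnProductivity lnPopulation IndustryShare Diversity Competition ///
    Control1 Control2 i.Industry i.year, re vce(cluster Region_Industry)
testparm i.Industry
```

If step 2 alone gives an R-squared near 0.5, the jump would mostly reflect differences in productivity levels across industries rather than a better fit for the regional variables.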
Thank you very much in advance for your time and advice!