I am using interrupted time series methods on household panel data. I have monthly data on about 2000 households over a three-year period (18 months before the event, 18 months after). The panel is unbalanced, with a mean follow-up of 29 months. I construct my counterfactual (extrapolate the pre trends into the post period) using a linear FE model controlling for the time period (before/after the event), a linear time trend (with an interaction with the time period to allow for a change in the trend), month dummies and a set of additional covariates.
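The specification described above can be sketched as follows. The notation is assumed here for illustration and is not taken from the code: \alpha_i is the household effect, \gamma_{m(t)} the month dummies, x_{it} the additional covariates, and the interaction is taken to be post times t without any recentring of t.

```latex
% Segmented-regression (ITS) model with household fixed effects
y_{it} = \alpha_i + \beta_1\,\mathrm{post}_t + \beta_2\,t
       + \beta_3\,(\mathrm{post}_t \times t) + \gamma_{m(t)}
       + x_{it}'\delta + \varepsilon_{it}

% Counterfactual in the post period: same fit with post switched off
\hat{y}^{\,cf}_{it} = \hat{\alpha}_i + \hat{\beta}_2\,t + \hat{\gamma}_{m(t)} + x_{it}'\hat{\delta}

% Fitted minus counterfactual for a post-period observation
\hat{y}_{it} - \hat{y}^{\,cf}_{it} = \hat{\beta}_1 + \hat{\beta}_3\,t

% Adjusted mean difference: average over post-period observations,
% where \bar{t}_{\mathrm{post}} is the mean of t among those observations
\overline{\Delta} = \hat{\beta}_1 + \hat{\beta}_3\,\bar{t}_{\mathrm{post}}
```

Under these assumptions the covariates do not enter the difference directly; they affect it only through the estimates of \beta_1 and \beta_3.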
I did a model building exercise to better understand the difference between the purely descriptive pre vs. post mean difference and the adjusted mean difference (between fitted and counterfactual values in the post period).
Here is the issue. When I fit a random effects model with xtreg, re vce(cluster panelvar) controlling only for the time period and a linear time trend, Hansen’s J test of overidentifying restrictions (the test of no correlation between the panel effects and the regressors, run with xtoverid) fails to reject (Chi2(2) = 0.15, p = 0.9276), as expected. However, once I add the interaction between the time period and the trend, the test strongly rejects (Chi2(3) = 14.543, p = 0.0023). The RE coefficient estimates differ slightly from the FE estimates, but xtreg, fe vce(robust) yields corr(u_i, Xb) = 0.0053, essentially the same as when the interaction is excluded (0.0054). The estimation sample is the same throughout.
How can the test reject when (1) the additional variable (the interaction) is conceptually unrelated to the panel effects, and (2) corr(u_i, Xb) is basically the same across the two FE models (with and without the interaction)?
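One way to see what the test is picking up is to run a Mundlak-type auxiliary regression by hand: add the household means of the time-varying regressors to the RE model and test them jointly with cluster-robust standard errors. This is only a sketch, not a reproduction of what xtoverid does internally, so the statistic should be close to the one reported by xtoverid but need not match it exactly, especially in an unbalanced panel. It assumes the data behind the output below are in memory and xtset on id; the *_bar variable names are hypothetical.

```stata
* Assumption: data in memory, xtset on id; y, post, t, post_t, id as in the post.
* The *_bar variables are hypothetical names introduced for this sketch.

* Fit the RE model first so that e(sample) marks the estimation sample
xtreg y post t post_t, re vce(cluster id)

* Household means of each regressor, computed over the estimation sample only
foreach v in post t post_t {
    bysort id: egen double `v'_bar = mean(`v') if e(sample)
}

* Augmented RE regression: original regressors plus their household means
xtreg y post t post_t post_bar t_bar post_t_bar if e(sample), re vce(cluster id)

* Joint cluster-robust Wald test that the household means have zero coefficients
test post_bar t_bar post_t_bar
```

If Stata drops one of the means for collinearity, or the joint test is driven by a single mean, that would point to which regressor's between-household variation is behind the rejection.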
Since my panel is unbalanced, I wondered if it could be due to attrition and replenishment of the sample. When I restrict the estimation sample to households with complete follow-up (about 60% of households), the Hansen’s J test rejects even when the period indicator is the only regressor (Chi2(1) = 109.351, p = 0.0000), even though xtreg, fe vce(robust) yields virtually the same coefficient estimate and corr(u_i, Xb) = 0 (as expected).
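A quick check that might help with the complete-follow-up result is to look at how much the household mean of post varies in that subsample. This is a sketch under the same assumptions as the output below (data in memory, T_i == 36 flags complete follow-up); post_bar36 is a hypothetical variable name.

```stata
* Assumption: data in memory; post, id and T_i as used in the post.
* post_bar36 is a hypothetical name introduced for this check.

* Household mean of post among households with complete follow-up
bysort id: egen double post_bar36 = mean(post) if T_i == 36

* Spread of the household means across the complete-follow-up sample
summarize post_bar36 if T_i == 36

* A standard deviation of zero would mean post has no between-household variation
display "SD of household means of post: " r(sd)
```

If every complete-follow-up household has the same 18 pre and 18 post months, the mean would be 0.5 for all of them, in which case the RE and FE estimators would have no between-household variation in post to disagree over.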
Am I missing something obvious? Or is this telling me something about the data?
I am doing this exercise because household composition has a sizable impact on the fitted vs. counterfactual mean difference when it is included in the model alongside the post * t interaction, but not when the interaction is omitted, and I am not sure why.
Any ideas on why I am getting these results, or on what I might do next to find out, would be greatly appreciated.
Maxime
Code:
. xtreg y post t, re vce(cluster id)

Random-effects GLS regression                   Number of obs     =     69,494
Group variable: id                              Number of groups  =      2,383

R-sq:                                           Obs per group:
     within  = 0.0078                                         min =          1
     between = 0.0006                                         avg =       29.2
     overall = 0.0046                                         max =         36

                                                Wald chi2(2)      =     174.10
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                 (Std. Err. adjusted for 2,383 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        post |  -32.61146   3.672993    -8.88   0.000    -39.81039   -25.41252
           t |  -.4647523   .2322854    -2.00   0.045    -.9200233   -.0094813
       _cons |   480.6522   6.420681    74.86   0.000     468.0679    493.2365
-------------+----------------------------------------------------------------
     sigma_u |  231.44268
     sigma_e |  228.77659
         rho |  .50579289   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(id)
Sargan-Hansen statistic   0.150  Chi-sq(2)    P-value = 0.9276

. xtreg y post t, fe vce(cluster id)

Fixed-effects (within) regression               Number of obs     =     69,494
Group variable: id                              Number of groups  =      2,383

R-sq:                                           Obs per group:
     within  = 0.0078                                         min =          1
     between = 0.0006                                         avg =       29.2
     overall = 0.0046                                         max =         36

                                                F(2,2382)         =      86.54
corr(u_i, Xb)  = 0.0054                         Prob > F          =     0.0000

                                 (Std. Err. adjusted for 2,383 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        post |  -32.56479   3.672894    -8.87   0.000    -39.76719   -25.36239
           t |  -.4696601   .2338195    -2.01   0.045    -.9281709   -.0111493
       _cons |    489.016   3.468408   140.99   0.000     482.2146    495.8174
-------------+----------------------------------------------------------------
     sigma_u |  239.63657
     sigma_e |  228.77659
         rho |  .52317216   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtreg y post t post_t, re vce(cluster id)

Random-effects GLS regression                   Number of obs     =     69,494
Group variable: id                              Number of groups  =      2,383

R-sq:                                           Obs per group:
     within  = 0.0078                                         min =          1
     between = 0.0004                                         avg =       29.2
     overall = 0.0046                                         max =         36

                                                Wald chi2(3)      =     174.64
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                 (Std. Err. adjusted for 2,383 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        post |  -44.64032   8.956723    -4.98   0.000    -62.19518   -27.08547
           t |  -.7876857   .3423413    -2.30   0.021    -1.458662    -.116709
      post_t |   .6493272   .4381844     1.48   0.138    -.2094984    1.508153
       _cons |   483.5995   6.881025    70.28   0.000     470.1129     497.086
-------------+----------------------------------------------------------------
     sigma_u |  230.67539
     sigma_e |  228.77046
         rho |  .50414608   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(id)
Sargan-Hansen statistic  14.543  Chi-sq(3)    P-value = 0.0023

. xtreg y post t post_t, fe vce(cluster id)

Fixed-effects (within) regression               Number of obs     =     69,494
Group variable: id                              Number of groups  =      2,383

R-sq:                                           Obs per group:
     within  = 0.0078                                         min =          1
     between = 0.0003                                         avg =       29.2
     overall = 0.0046                                         max =         36

                                                F(3,2382)         =      57.80
corr(u_i, Xb)  = 0.0053                         Prob > F          =     0.0000

                                 (Std. Err. adjusted for 2,383 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        post |   -46.1116   8.987428    -5.13   0.000    -63.73559   -28.48761
           t |   -.832746   .3437206    -2.42   0.015    -1.506768   -.1587235
      post_t |   .7309716   .4394493     1.66   0.096     -.130771    1.592714
       _cons |   492.4736   4.386992   112.26   0.000     483.8709    501.0763
-------------+----------------------------------------------------------------
     sigma_u |  239.70379
     sigma_e |  228.77046
         rho |  .52332547   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtreg y post if T_i == 36, re vce(cluster id)

Random-effects GLS regression                   Number of obs     =     49,572
Group variable: id                              Number of groups  =      1,377

R-sq:                                           Obs per group:
     within  = 0.0000                                         min =         36
     between = 0.0000                                         avg =       36.0
     overall = 0.0041                                         max =         36

                                                Wald chi2(1)      =     109.35
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                 (Std. Err. adjusted for 1,377 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        post |  -40.97813   3.918698   -10.46   0.000    -48.65864   -33.29763
       _cons |   498.4042   6.921887    72.00   0.000     484.8376    511.9709
-------------+----------------------------------------------------------------
     sigma_u |  225.27162
     sigma_e |  223.84015
         rho |  .50318729   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(id)
Sargan-Hansen statistic 109.351  Chi-sq(1)    P-value = 0.0000