Hi everyone,

I'm analyzing a panel dataset with T = 17 and 42 groups, about 700 observations in total. I have estimated an ADL (autoregressive distributed lag) model and would now like to run a difference GMM estimation, since this is common in the literature of my field and I suspect reverse causality.

My dependent variable is y. The variable of interest is x, which I assume to be endogenous, and z1 to z4 are control variables. The ADL model I fitted contains 2 lags of y, 5 lags of x, year dummies, and the control variables.
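
Written out, the specification described above would look roughly as follows. This is only a sketch: the coefficient symbols, the year effect \tau_t, and the group effect \mu_i are notation assumed here for illustration and are not taken from the original estimation.

```latex
y_{it} = \alpha_1 y_{i,t-1} + \alpha_2 y_{i,t-2}
       + \sum_{k=1}^{5} \beta_k x_{i,t-k}
       + \sum_{j=1}^{4} \gamma_j z_{j,it}
       + \tau_t + \mu_i + \varepsilon_{it}
```

Here i indexes groups and t indexes years. Difference GMM estimates this equation in first differences, which removes \mu_i.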

My GMM follows the suggestions by Roodman (2009) [1]:

xtabond2 y L(1/2).y L(1/5).x i.year z1 z2 z3 z4, iv(z1 z2 z3 z4) gmm(x, laglimit(6 .) collapse) ///
    gmm(y, laglimit(3 .) collapse equation(diff)) level(95) robust nolevel small

My first question is whether the lags L(1/5).x should be instrumented GMM-style, as
gmm(x, laglimit(6 .) collapse)
(as I read [1], for a regressor that is predetermined but not strictly exogenous, the instrument lags should start one period after the deepest lag in the model, i.e. at lag 6), or IV-style, as
iv(L(1/5).x, equation(level))
as suggested by the xtabond2 documentation for predetermined variables.

As far as I can tell, [1] points to the gmm() version, but the iv() version gives more stable results in my case. Any suggestions on this?

Secondly, the GMM estimation is very sensitive to small changes in the model. For example, the command above produces the output at the end of this post, where the confidence intervals are so wide that the estimates are unusable.
Restricting the lag ranges and removing collapse, i.e.
xtabond2 y L(1/2).y L(1/5).x i.year z1 z2 z3 z4, iv(z1 z2 z3 z4) gmm(x, laglimit(6 8)) ///
    gmm(y, laglimit(3 5) equation(diff)) level(95) robust nolevel small
gives narrower confidence intervals, but the specification tests become uninformative, with many p-values equal to 1.000. The instrument count here is 82.
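
To make the comparison between the two runs concrete, the instrument count and test p-values can be displayed side by side after each estimation. The following is a minimal do-file sketch, assuming xtabond2 leaves the instrument count in e(j), the number of groups in e(N_g), and the Hansen and AR(2) p-values in e(hansenp) and e(ar2p). The labels A and B are only illustrative.

```stata
* Specification A: collapsed instruments, open-ended lag ranges (first command in the post)
quietly xtabond2 y L(1/2).y L(1/5).x i.year z1 z2 z3 z4, ///
    iv(z1 z2 z3 z4) gmm(x, laglimit(6 .) collapse) ///
    gmm(y, laglimit(3 .) collapse equation(diff)) level(95) robust nolevel small
display "A: instruments = " e(j) ", groups = " e(N_g) ///
    ", Hansen p = " %5.3f e(hansenp) ", AR(2) p = " %5.3f e(ar2p)

* Specification B: uncollapsed instruments, restricted lag ranges (second command in the post)
quietly xtabond2 y L(1/2).y L(1/5).x i.year z1 z2 z3 z4, ///
    iv(z1 z2 z3 z4) gmm(x, laglimit(6 8)) ///
    gmm(y, laglimit(3 5) equation(diff)) level(95) robust nolevel small
display "B: instruments = " e(j) ", groups = " e(N_g) ///
    ", Hansen p = " %5.3f e(hansenp) ", AR(2) p = " %5.3f e(ar2p)
```

Printing the instrument count next to the number of groups is useful because [1] suggests, as a rule of thumb, keeping the former below the latter. Hansen p-values close to 1.000 are a typical symptom when it is not.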

The model with
iv(L(1/5).x, equation(level))
has on the other hand more similar results to the ADL model I fitted.

I would be very interested in feedback on the model and whether my results indicate that the GMM is too sensitive to the model specification to be used for my dataset.

Thank you very much in advance!


. xtabond2 y L(1/2).y L(1/5).x i.year z1 z2 z3 z4, iv(z1 z2 z3 z4) gmm(x, laglimit(6 .) collapse)
> gmm(y, laglimit(3 .) collapse equation(diff)) level(95) robust nolevel small
Favoring speed over space. To switch, type or click on mata: mata set matafavor space, perm.
2000b.year dropped due to collinearity
2001.year dropped due to collinearity
2002.year dropped due to collinearity
2003.year dropped due to collinearity
2004.year dropped due to collinearity
2009.year dropped due to collinearity
2018.year dropped due to collinearity
2019.year dropped due to collinearity
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate robust weighting matrix for Hansen test.
  Difference-in-Sargan/Hansen statistics may be negative.

Dynamic panel-data estimation, one-step difference GMM
Group variable: rec_num                         Number of obs      =       477
Time variable : year                            Number of groups   =        43
Number of instruments = 31                      Obs per group: min =         0
F(0, 43)      =         .                                      avg =     11.09
Prob > F      =         .                                      max =        12
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           y |
         L1. |   1.781565   .1623636    10.97   0.000     1.454128    2.109002
         L2. |  -.7983964   .2120962    -3.76   0.001    -1.226129   -.3706637
             |
           x |
         L1. |   .0680324   .5368132     0.13   0.900    -1.014555    1.150619
         L2. |  -.3129848   .2357713    -1.33   0.191    -.7884629    .1624934
         L3. |  -.2442823   .3147541    -0.78   0.442    -.8790444    .3904798
         L4. |  -.0797133   .3386423    -0.24   0.815    -.7626505     .603224
         L5. |   .0101173   .0706096     0.14   0.887    -.1322806    .1525152
             |
        year |
       2005  |  -.9880184   .8313964    -1.19   0.241    -2.664689    .6886523
       2006  |  -.7845101   .6621171    -1.18   0.243    -2.119796    .5507762
       2007  |   -.146091   .3084817    -0.47   0.638    -.7682036    .4760216
       2008  |  -.1693875   .1717921    -0.99   0.330    -.5158393    .1770643
       2010  |  -.2693423   .2671217    -1.01   0.319    -.8080445    .2693598
       2011  |   .1250097   .4186119     0.30   0.767    -.7192016     .969221
       2012  |   .1334546   .4359198     0.31   0.761    -.7456615    1.012571
       2013  |   .0319735   .4946734     0.06   0.949    -.9656306    1.029578
       2014  |  -.0193052    .612934    -0.03   0.975    -1.255404    1.216794
       2015  |  -.0516131   .6841706    -0.08   0.940    -1.431375    1.328148
       2016  |  -.0693695   .8155355    -0.09   0.933    -1.714054    1.575315
       2017  |  -.2342953   .8617778    -0.27   0.787    -1.972236    1.503645
             |
          z1 |  -.0068974   .0061143    -1.13   0.266     -.019228    .0054332
          z2 |   99.51894   68.89719     1.44   0.156    -39.42549    238.4634
          z3 |  -99.22053   68.87042    -1.44   0.157     -238.111     39.6699
          z4 |  -.0006517   .0016489    -0.40   0.695     -.003977    .0026737
------------------------------------------------------------------------------

Arellano-Bond test for AR(1) in first differences: z =  -1.62  Pr > z =  0.106
Arellano-Bond test for AR(2) in first differences: z =   0.07  Pr > z =  0.944
Sargan test of overid. restrictions: chi2(8)    =  22.91  Prob > chi2 =  0.003
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(8)    =   5.88  Prob > chi2 =  0.661
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  iv(edu log_pop log_pop_density maternal_rate)
    Hansen test excluding group:     chi2(4)    =   2.10  Prob > chi2 =  0.717
    Difference (null H = exogenous): chi2(4)    =   3.78  Prob > chi2 =  0.437