Dear Dr. Kripfganz,

Following your suggestions in previous posts, I decided to use the xtdpdgmm command. Unlike the other GMM commands, it lets me understand the meaning of each option that I specify on the command line. I would like to ask whether I am using the command appropriately, which may also help other researchers implement it.

I have a panel data set with n=769 and T=12. The data contain monthly information on individuals’ depression level (dep_score), average income (avg_inc), health status (health_st), and injuries (injury), together with time dummies (w_*) for each month. The variable individualid identifies individuals.

I want to understand how income, health status, and injuries affect the depression level. Since depression status is roughly stable, I include the first lag of the depression score as an independent variable. Moreover, I use a dynamic model because depression could affect average income and health status in the subsequent period. Based on your previous explanations, I assume that average income and health status are predetermined variables. Injuries and the time dummies are exogenous variables in my model.

I ran a regression with xtdpdgmm. I used the vce option to obtain Windmeijer’s correction. The small option reports t statistics instead of z statistics. The two option requests the two-step GMM estimator. The collapse option reduces the number of instruments.

The command I ran was:
Code:
xtdpdgmm L(0/1).dep_score avg_inc health_st injury w_*, model(diff) collapse gmm(dep_score, l(2 4)) gmm(avg_inc health_st, l(1 3))  gmm(injury w_*, l(0 2)) nocons  two small vce(cl individualid)
1. How can I test whether average income and health status are endogenous or predetermined variables? When I define them as endogenous rather than predetermined, the p-value of the AR(1) test is 0 in both models, the p-value of the AR(2) test increases to 0.81 from 0.20, and the p-value of the Hansen test stays almost the same (0.353 vs. 0.352). The value reported for "Fitting full model", step 1, decreases to 0.13 from 0.15.
Which statistics should I take into account when deciding on the correct classification of the variables?

2. The exogenous injury variable and the time dummies are control variables in my model. Should I specify them in gmm() or in iv()?

3. Do I need to specify m(diff) or m(level) inside every gmm() option? I did not understand the difference between the level equation and the difference equation. When I use m(l) in gmm(injury w_*, l(0 2) m(l)), the value reported for "Fitting full model", step 1, increases to 0.35 from 0.15. When do I need instruments for the level equation rather than the difference equation, and how can I decide?

4. What is the role of the nocons option?

5. When I run the same model with model(fodev), the lagged dependent variable becomes insignificant: its t-value decreases to 0.84 from 2.80, while the number of observations stays the same. On the other hand, the time dummies become significant. What could be the reason? The p-value of the AR(1) test is 0 in both models, the p-value of the AR(2) test increases to 0.83 from 0.20, and the p-value of the Hansen test increases to 0.70 from 0.352. The value reported for "Fitting full model", step 1, increases to 0.23 from 0.15.

6. When I run the same model with model(level), the t-value of the lagged dependent variable increases to 38.6 from 2.80, while the number of observations stays the same. The time dummies again become significant. What could be the reason? The p-value of the AR(1) test is 0 in both models, the p-value of the AR(2) test decreases to 0.0004 from 0.20, and the p-value of the Hansen test decreases to 0.009 from 0.352. The value reported for "Fitting full model", step 1, increases to 1.13 from 0.15. How can I decide between the fodev, level, and diff models?

7. Why does Stata drop two time dummies in the diff model? It drops only one when I use system GMM. Can I specify which dummies are dropped if I need the estimate for a particular time dummy?
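
To make question 1 concrete, here is a sketch of the two specifications I am comparing, followed by the postestimation tests I look at. The lag range l(2 4) in the endogenous version is only my assumption about how the instruments should be shifted by one period, and the sketch assumes the data have already been declared as a panel with xtset.

```stata
* Sketch only: assumes the panel is already xtset on individualid and the month variable.

* (a) avg_inc and health_st treated as predetermined: instruments start at lag 1
xtdpdgmm L(0/1).dep_score avg_inc health_st injury w_*, model(diff) collapse ///
    gmm(dep_score, lag(2 4)) gmm(avg_inc health_st, lag(1 3)) ///
    gmm(injury w_*, lag(0 2)) nocons twostep small vce(cluster individualid)
estat serial, ar(1/2)   // Arellano-Bond tests for AR(1) and AR(2)
estat overid            // Hansen overidentification test

* (b) avg_inc and health_st treated as endogenous: instruments start at lag 2
*     (the lag range 2-4 is an assumption, not a recommendation)
xtdpdgmm L(0/1).dep_score avg_inc health_st injury w_*, model(diff) collapse ///
    gmm(dep_score, lag(2 4)) gmm(avg_inc health_st, lag(2 4)) ///
    gmm(injury w_*, lag(0 2)) nocons twostep small vce(cluster individualid)
estat serial, ar(1/2)
estat overid
```

The only difference between (a) and (b) is the lag range for avg_inc and health_st, so any change in the test results should come from that classification alone.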

Thank you for your great help to all researchers in this forum.

Best regards,
John