Dear Statalisters,

I am working on an application of an instrumental variable (IV) approach to address unobserved confounding. My main aim at this stage is to understand which estimation method is most suitable and how to implement it in Stata. I have read a lot of the literature on this topic, but I think input from the Statalist forum would be beneficial, especially as I see that many highly skilled researchers who have published on IV specifications are active in this community (e.g. excellent work by Jeff Wooldridge, Joao Santos Silva and others).

The context is a just-identified application with one endogenous variable and one instrument with the following specifications:
  • Panel data: 7 years on a population of approximately 10 000 patients.
  • Instrument: Continuous (rate, i.e. “preference instrument”), which I have also considered implementing as a set of binary indicators using Stata's -i.IV- specification.
  • Treatment: Binary (medication over a given period, yes/no).
  • Outcome: Count/binary (outcome can be defined both ways).
The approach I find most suitable at the moment is a combination of GLM and GMM as outlined in Johnston et al. (2008), which addresses both binary and count responses (although I have not found a specific discussion of the case where the IV is continuous). I also see that Wooldridge comments in #6 here that the FE Poisson estimator is preferable when addressing endogeneity in count panel models, although I am unsure whether this applies to an application with a continuous instrument.

Additionally, I see the Arellano-Bond estimator as a relevant approach. My (currently vague) understanding of the Arellano-Bond estimator is that lagged values of the dependent variable themselves serve as the instruments, so we would not need an additional instrument.

Hopefully some of you have input on this IV setup and its implementation in Stata (I assume a version of -xtivreg- is the way to go).
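
To make the question concrete, here is a minimal sketch of the three candidates as I currently understand them. All variable names (id, year, y, treat, z, x1) are hypothetical placeholders rather than my actual data, and the specifications are only my assumption of how each approach would be set up, not a worked-out model:

```stata
* Hypothetical variable names (placeholders, not my actual data):
*   id, year   panel identifiers
*   y          outcome (count version)
*   treat      binary treatment, regarded as endogenous
*   z          continuous preference instrument
*   x1         an exogenous covariate
xtset id year

* (1) Linear fixed-effects IV: treat instrumented by z
xtivreg y x1 (treat = z), fe

* (2) Exponential-mean GMM for the count outcome, pooled
*     (this does not include patient fixed effects)
ivpoisson gmm y x1 (treat = z), multiplicative vce(cluster id)

* (3) Arellano-Bond: lagged levels of y instrument the
*     first-differenced equation; treat is declared endogenous,
*     so its own lags are used as instruments instead of z
xtabond y x1, lags(1) endogenous(treat) vce(robust)
```

Note that (2) ignores the panel structure apart from the clustered standard errors, and (3) adds a lagged dependent variable to the model, so the three are not estimating the same thing. Which of these (if any) is defensible in my setting is exactly what I am unsure about.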