Hello everybody,

In my research, the dependent variable (DV) is a count, the endogenous variable (EV) is binary, and there are additional exogenous variables (EXV).
The main-effects model is negative binomial, since my data are over-dispersed.
I am wondering whether I can apply the control function approach in a two-stage regression. Specifically, is it valid to use the predicted probability from the first stage in place of the EV in the second stage, as in these steps (IV denotes the instrument)?
1. logistic EV EXV IV
2. predict yhat
3. nbreg DV yhat EXV
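
To make the steps concrete, here is a minimal sketch on simulated data. The variable names (dv, ev, x1, x2, z), the coefficients, and the data-generating process are all placeholders assumed for illustration, not my actual data. The first part follows steps 1-3 as listed, which substitute the predicted probability for EV. The second part shows what, as far as I understand, is more commonly meant by a control function (two-stage residual inclusion): the observed EV stays in the model and the first-stage residual is added as an extra regressor.

```stata
* Simulated data for illustration only: all names and values are assumptions
clear
set seed 12345
set obs 2000
generate x1 = rnormal()                      // exogenous covariate
generate x2 = rnormal()                      // exogenous covariate
generate z  = rnormal()                      // instrument
generate u  = rnormal()                      // unobserved confounder
generate ev = runiform() < invlogit(0.5*z + 0.3*x1 + u)
generate dv = rpoisson(exp(0.2 + 0.5*ev + 0.3*x1 - 0.2*x2 + 0.5*u))

* Steps 1-3 as listed: predicted probability replaces EV
logit ev x1 x2 z
predict phat, pr
nbreg dv phat x1 x2

* Residual-inclusion variant: keep EV, add the first-stage residual
generate rhat = ev - phat
nbreg dv ev rhat x1 x2
```

In either version the second-stage standard errors ignore that phat or rhat is estimated, so they would need correcting, for example by bootstrapping both stages together.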

I tried ivpoisson gmm, but because my model contains many variables and dummies, the estimator keeps iterating and the criterion function does not decrease.
I also tried Poisson pseudo-maximum likelihood to handle the over-dispersion, but that still leaves the endogeneity of the binary EV to deal with.

Could anyone advise what the possible approaches are, and whether the control function (CF) approach above works?


Thanks in advance.