Hello,

I have a time-series model in which the dependent variable y is binary and one of my regressors, var1, is continuous and endogenous (a simultaneity problem: y influences var1 while var1 also influences y).

I don't have an instrument, so I thought about instrumenting var1 with its residuals from a first-stage regression. Since var2, var3, var4 and var5 may also have an endogeneity problem with var1, I need to lag them.

At first I thought about doing this by hand, as follows:

Code:
reg var1 l.var2 l.var3 l.var4 l.var5
predict double resid_var1, residuals
probit y resid_var1 var2 var3 var4 var5 var6 var7
I don't think this falls under what is considered the forbidden regression.
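
In case it helps to reproduce the setup, here is a minimal sketch of the same three steps on simulated data. The time variable t, the seed, the sample size, and the data-generating process are all made up for illustration; only the reg/predict/probit sequence matches what I am actually running.

```stata
* Minimal sketch on simulated data. The variable t, the seed, the sample
* size and the data-generating process are assumptions for illustration.
clear
set seed 12345
set obs 200
generate t = _n
tsset t                          // needed before the L. operator can be used

* Hypothetical regressors var2-var7
forvalues j = 2/7 {
    generate var`j' = rnormal()
}

* var1 depends on lagged regressors, so its first observation is missing
generate var1 = 0.5*L.var2 + 0.3*L.var3 + rnormal()

* Binary outcome; the if-condition keeps y missing where var1 is missing
generate y = (0.8*var1 + 0.4*var2 + rnormal() > 0) if !missing(var1)

* Step 1: regress var1 on the lagged regressors only
regress var1 L.var2 L.var3 L.var4 L.var5

* Step 2: residuals, restricted to the first-stage estimation sample
predict double resid_var1 if e(sample), residuals

* Step 3: probit with the residual in place of var1
probit y resid_var1 var2 var3 var4 var5 var6 var7
```

Restricting predict to e(sample) is only there so that both steps use the same observations, since the lags drop the first period.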

But then I read about the command cdsimeq, which apparently does all of this in one step.

Code:
cdsimeq (var1 l.var2 l.var3 l.var4 l.var5) (y var2 var3 var4 var5 var6 var7), instpre
However, the two approaches give me different results.

My main issue is that I don't know whether what I'm doing with reg/predict/probit is correct, or why cdsimeq uses variables from the y equation to explain var1. That is, its first stage looks like:

Code:
reg var1 l.var2 l.var3 l.var4 l.var5 var2 var3 var4 var5 var6 var7
This is problematic for me because, given the endogeneity issues I am facing, using both var2 and l.var2 (for instance) ruins my analysis.

Can anyone explain this or offer some advice on how to proceed?

I am using Stata 14 on Mac.

Best,

Jonas