Dear all,
I am interested in estimating Y = A + B1*X1 + B2*X2 + B3*X3 + EPS.
I am not worried that any of (X1, X2, X3) is endogenous, but I am worried that I observe Y only in 80% of the cases for which I observe (X1,X2,X3).
One way to tackle this would be -heckman-, where I would first estimate a probit of an indicator for observing Y, say D, on (X1,X2,X3) as well as an "instrument" Z that plausibly affects D but not Y on its own. Then in stage 2 I would estimate the equation for Y on the cases where Y is observed, controlling for a selection-correction term (the inverse Mills ratio) computed from the stage 1 estimates.
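To make the two steps concrete, this is how I understand the model behind the procedure. The symbols g0..g3, d, u, rho, sigma and lambda are my own labels, and the joint normality of (EPS, u) is the extra assumption I refer to below, so please read this as a sketch of my understanding rather than a definitive statement:

```latex
% Stage 1 (selection): probit for observing Y
D = \mathbf{1}\left[ g_0 + g_1 X_1 + g_2 X_2 + g_3 X_3 + d\,Z + u > 0 \right],
\qquad u \sim N(0,1)

% Assumption: (EPS, u) jointly normal, Corr(EPS, u) = \rho, SD(EPS) = \sigma
% Stage 2 (outcome): regression on the cases with D = 1
E\left[ Y \mid X, Z, D = 1 \right]
  = A + B_1 X_1 + B_2 X_2 + B_3 X_3 + \rho\,\sigma\,\lambda(c),

% where c is the stage 1 index and \lambda is the inverse Mills ratio
c = g_0 + g_1 X_1 + g_2 X_2 + g_3 X_3 + d\,Z,
\qquad
\lambda(c) = \frac{\phi(c)}{\Phi(c)}
```

If rho = 0, the correction term drops out and a plain regression of Y on (X1,X2,X3) using the observed cases would be fine; the worry is precisely that rho may not be 0.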
This procedure shares with IV estimation that I need to find an instrument Z which plausibly affects one variable (here D) but does not affect another variable (Y) through any other channel. However, I have read in various places that this Heckman procedure requires an additional normality assumption which IV estimation does not. Ceteris paribus that would make IV appear superior.
Yet I fail to see how I could tackle precisely the problem above with -ivregress-. Firstly, I would want to instrument not any specific regressor X, but the dummy for observing Y, which however does not explicitly feature in my main equation of interest. Secondly, I could not even estimate that equation on the full sample of observations precisely because I do not observe Y for 20% of them.
Can someone help me close that gap in my understanding? Or is it that, although both procedures need a valid "instrument" (unless the Heckman first stage relies purely on functional form for identification), they are too different, so that each remains the best available tool for the problem it was designed for?
Thanks so much and best regards,
PM