I have a data set with 10k observations for Y and an endogenous regressor X with many missing observations (90%). I have an instrument Z with no missing values. I know that the values of X are missing at random.
I think the naive approach would be to run
Code:
ivregress 2sls Y (X=Z)
Since ivregress drops observations with missing X, this uses only the roughly 1,000 complete cases. Ideally, I would run the first stage on the subsample with non-missing X and then the second stage on the full sample, which should give me more power.
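To make the idea concrete, here is a minimal manual sketch of what I have in mind. The variable name Xhat is just a placeholder I made up. I assume the standard errors from the second regression are not valid as printed, since they ignore that Xhat is itself estimated, so this only illustrates the point estimate.

```stata
* First stage: regress automatically uses only observations with non-missing X
regress X Z

* Fitted values: predict fills in Xhat for every observation with non-missing Z,
* including those where X itself is missing
predict double Xhat, xb

* Second stage on the full sample, with the fitted values in place of X
* (the reported standard errors do not account for the generated regressor)
regress Y Xhat
```

With a single instrument and no other covariates, the slope on Xhat is the full-sample reduced-form coefficient of Y on Z divided by the first-stage coefficient from the subsample.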
How is this possible in Stata?
Are there issues I am neglecting?
Are there papers about this?
PS: I am cross-posting a similar question here: https://stats.stackexchange.com/ques...ous-regressors