I am using Stata 15 with the built-in eregress command and the user-written cmp package, and I have run into the following problem.
Let's say:
Y: Wage, continuous
T: Whether you are treated, binary
D: IV for T
X: control
Z: Whether you work
My data set looks like the following:
#  | Y       | T       | Z
1  | 15      | 1       | 1
2  | 14      | 1       | 0
3  | 16      | 1       | missing
4  | 17      | 0       | 1
5  | 18      | 0       | 0
6  | 19      | 0       | missing
7  | 88      | missing | 1
8  | 5       | missing | 0
9  | missing | 1       | 0
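For anyone who wants to reproduce the pattern, here is a minimal sketch that types the toy table into Stata and compares the "wage is observed" flag with Z. The variable name id is only an illustrative label for the row number, and selectvar is built with the same Y < . logic used for the selection indicator later in this post.

```stata
* Toy data from the table above; "." is Stata's numeric missing value
clear
input id Y T Z
1 15 1 1
2 14 1 0
3 16 1 .
4 17 0 1
5 18 0 0
6 19 0 .
7 88 . 1
8  5 . 0
9  . 1 0
end

* Flag rows where the wage is observed (1) versus missing (0)
generate byte selectvar = Y < .

* Cross-tabulate the flag against Z, keeping missing Z as its own column
tabulate selectvar Z, missing
```

In this toy table Y is nonmissing in eight of the nine rows, including rows where Z is 0 or missing, so a flag built from Y < . need not coincide with Z.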
My naive regression is: regress Y on T and X.
The problem is that wages are only observed when the individual is in the labor market, that is, when Z takes the value 1. So I want to fit a Heckman selection model like the following:
outcome equation: reg Y on T, X
selection equation: reg Z on T, X
Now the problem is that T is also endogenous to Z. I therefore want to use the instrument D (called IV in my code) for T in the outcome equation, but not in the selection equation. At this stage, I see two ways to go.
Way one - cmp package
code:
cmp (wage = T X) (selectvar = T_endo X ) (T = IV) , ind(selectvar*$cmp_cont $cmp_probit $cmp_cont)
where selectvar is generated by the following command: gen selectvar = wage<. (this follows the logic of the example in the cmp manual). T_endo is a variable I created to replace T in the selection equation, so that it is not instrumented by the IV in the third equation.
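The preparation that this cmp call relies on could be sketched roughly as follows. This assumes cmp and its dependency ghk2 are installed from SSC, and it assumes that T_endo is nothing more than a copy of T under another name, which is only one reading of the description above. The short input block is a stand-in that reuses the T column of the toy table so the snippet runs on its own.

```stata
* One-time installation of the user-written packages (needs internet access)
ssc install ghk2, replace
ssc install cmp, replace

* Define the global macros ($cmp_cont, $cmp_probit, ...) that ind() expects
cmp setup

* Show the numeric codes behind two of those macros
display "cmp_cont = ${cmp_cont}, cmp_probit = ${cmp_probit}"

* Stand-in data: the T column from the toy table in this post
clear
input T
1
1
1
0
0
0
.
.
1
end

* ASSUMPTION: T_endo is a plain copy of T under a different name
clonevar T_endo = T
```

If cmp setup has not been run in the current session, the $cmp_* macros inside ind() expand to nothing, so it is worth running it before the cmp call itself.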
My question is:
(1) Does my code make sense given what I want and the data I have? Note that this approach does not use my Z variable.
Way two - eregress
code:
eregress Y, entreat(T = IV) select(selectvar = T_endo X)
where selectvar is generated by the following command: gen selectvar = wage<.
My questions are:
(1) The code gives me the error message "can't find initial value". Can anyone help me solve this error?
(2) Does my code make sense given what I want and the data I have?
What should I try next? Thank you all for taking the time to read my post. I appreciate it.
Best
Xu