Dear Statalists,
I am confused about how to correct for selection on one independent variable.
I want to estimate Y_ft = beta*Certified_ft + Z_ft in a sample covering 2000-2013, where f and t index firm and year respectively. Z are exogenous variables. Certified_ft indicates whether firm f gets a certification in year t. However, Certified_ft is only observed for firms that survived until 2018. (I combined two datasets: one reports Y for 2000-2013; the other reports when firms got certified, but only for firms that survived until 2018.)
So I face two selection issues: 1) the endogeneity of Certified_ft: some factors affect Certified and Y at the same time. I developed an instrument z1 for it; 2) survivor bias: I only observe Certified_ft for firms that survived until 2018. I am wondering how to correct for these biases. I thought of two possibilities:
1) Semykina & Wooldridge (2010) correct for both endogeneity and selection. Their approach is similar to the Heckman two-step method. However, it applies to selection on the dependent variable rather than on an independent variable.
2) The control function approach in Imbens & Wooldridge (2007) (page 4). First estimate a probit model of Prob(Certified_ft) on instruments z2 (hoping to correct for the survivor bias) and obtain its predicted probabilities p2. Then estimate Y on Certified, Z and p2, probably using 2SLS (with z1 as the instrument for Certified).
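To make method 2 concrete, here is a rough sketch of the two steps as I understand them. All variable names (y, certified, z1, z2, x1, x2, firm, year) are placeholders for illustration, with x1 and x2 standing in for the Z controls, and I am not sure this specification is the right one:

```stata
* Sketch only: y, certified, z1, z2, x1, x2, firm and year are placeholder names.
* x1 and x2 stand in for the exogenous controls Z.

* Step 1: probit of certification on z2 and the exogenous controls
probit certified z2 x1 x2 i.year

* Predicted probability of certification from the probit
predict double p2, pr

* Step 2: 2SLS of y on certified, the controls and p2,
* instrumenting certified with z1 and clustering by firm
ivregress 2sls y x1 x2 p2 i.year (certified = z1), vce(cluster firm)

* First-stage diagnostics for the instrument z1
estat firststage
```

Note that the standard errors in step 2 treat p2 as data and ignore that it was estimated in step 1.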
It is a little complicated because I face two layers of selection. Do you think method 2 can help me address this problem? Or what other approach would you recommend?
Any comments would be appreciated! Thank you.
Best,
K