Dear Statalists,
I am confused about how to correct for selection on one independent variable.
I want to estimate Y_ft = beta*Certified_ft + Z_ft on a sample covering 2000-2013, where f and t index firm and year, respectively. Z_ft are exogenous variables, and Certified_ft indicates whether the firm gets a certification in year t. However, Certified_ft is only observed for firms that survived to 2018. (I combined two datasets: one reports Y for 2000-2013; the other reports when firms got certified, but only for firms that survived to 2018.)
So I face two selection issues: 1) the endogeneity of Certified_ft: some factors affect Certified and Y at the same time. I have developed an instrument z1 for it; 2) survivor bias: I only observe Certified_ft for firms that survived to 2018. I wonder how to correct for these biases. I have thought of two possibilities:
1) Semykina & Wooldridge (2010) correct for both endogeneity and selection. Their method is similar to the Heckman two-stage method. However, it applies to selection on the dependent variable rather than on an independent variable.
2) The control function approach in Imbens & Wooldridge (2007) (page 4). First estimate a probit model of Prob(Certified_ft) on instruments z2 (hoping to correct for the survivor bias) and obtain its predicted probabilities p2; then estimate Y on Certified, Z and p2, probably using 2SLS (with z1 as the instrument for Certified).
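To make method 2 concrete, this is roughly the setup I have in mind. It is only a sketch: the coefficient vectors gamma and pi, the coefficient rho on p2, and the error terms u_ft and e_ft are my own notation rather than anything taken from either paper, and I have assumed the probit uses only z2 as regressors.

```latex
\begin{align}
  % Outcome equation of interest (gamma and u_{ft} added for completeness)
  Y_{ft} &= \beta\,\mathit{Certified}_{ft} + Z_{ft}\gamma + u_{ft} \\
  % Step 1: probit on the instruments z2, giving the predicted probability p2
  \hat{p}_{2,ft} &= \Phi\!\left(z_{2,ft}\,\hat{\pi}\right),
    \qquad \Pr(\mathit{Certified}_{ft} = 1 \mid z_{2,ft}) = \Phi\!\left(z_{2,ft}\,\pi\right) \\
  % Step 2: add p2 as a control and instrument Certified with z1 (2SLS)
  Y_{ft} &= \beta\,\mathit{Certified}_{ft} + Z_{ft}\gamma + \rho\,\hat{p}_{2,ft} + e_{ft}
\end{align}
```

In the last equation Certified_ft would be treated as endogenous with z1 as the excluded instrument, while Z_ft and p2 are treated as exogenous controls. Whether p2 actually absorbs the survivor bias is exactly the part I am unsure about.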
It is a little complicated because I face two layers of selection. Do you think method 2 can address this problem? If not, what other approach would you recommend?
Any comments would be appreciated! Thank you.
Best,
K