Sorry if this is long, but I think the context is needed.
I have a regression that estimates the likelihood of enrolling in a hypothetical program, using survey data, as a function of price and several demographic variables. We also have a dependent variable for enrollment in a hypothetical free program, which we're using as the selection variable. In the survey, enrolling in the free program was a prerequisite for seeing the paid program question: anyone who answered "No" to the free program was not asked whether they would participate in the program at a cost. Price for the paid program was randomly assigned ($3, $6, or $9). An OLS regression of enrollment on price indicates that the effect of price is negative and statistically significant (p < 0.001, both with demographic controls and with price as the only independent variable), and a cross tab of enrollment by price suggests the same.
However, when I run the Heckman selection model with free enrollment as the selection variable and paid enrollment as the main dependent variable, the model fails to converge and the coefficient on price goes to zero (-.00000000972 after 100 iterations). I get a variety of "not concave" and "backed up" messages, depending on the specific variation of the code I run. The chi-squared term even goes down with more iterations. I've tried different initial values for the coefficients, removing different variables, using the "difficult" option, and changing the convergence thresholds, but nothing gets the model to estimate the effect of price well. However, if I add a random variable that is highly correlated with free program enrollment, the model converges quickly (12 iterations) and estimates a strong negative effect of price for the paid program. I have another similar model (just with different types of programs) that uses the same basic commands and code, and it works well and produces the expected results. Does that suggest that the independent variables have so little predictive ability for the free program that the Heckman method just doesn't work? If I run a probit for the free program, I get a pseudo R-squared of ~0.4, which is similar to the R-squared I get from running OLS for the free program.
The base code is:
heckman paidseparate price2 x_2 x_3... x_20, select(freeseparate= x_2 x_3... x_20)
I also have a version where I specify initial values for the coefficients (taken from separate OLS regressions of each dependent variable on the independent variables), and another where I set an initial value for price only and leave everything else at 0. The full code, with initial values, is:
heckman paidseparate price2 black hispanic otherrace college female South Northeast West fulltime parttime age income1000 urban totalhousehold children socialproof_recycling rec_responsible_self rec_responsible_others rec_pleasant, select(freeseparate= black hispanic otherrace college female South Northeast West fulltime parttime age income1000 urban totalhousehold children socialproof_recycling rec_responsible_self rec_responsible_others rec_pleasant) iterate(100) from(-.057 0.091 0.034 0.029 -0.101 -0.035 -0.016 -0.006 -0.069 0.096 0.1 -0.002 0.006 -0.001 0.005 0.05 0.228 0.086 -0.015 0.054 .33 -0.011 0.026 -0.023 0.086 0.044 0.021 -0.07 0.017 -0.029 -0.031 -0.001 0.003 0.071 0.028 -0.047 -0.039 0.033 -0.01 -0.05 .7 -.5 -.6, copy)
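For reference, the side checks described above (OLS on price, the cross tab, the probit for the free program, and the "difficult" option) can be sketched as follows. This is a minimal sketch that reuses the variable names from the full command; the local macro `controls` and the specific options shown are assumptions for illustration, not the exact commands that were run.

```stata
* Sketch only: variable names come from the heckman command above;
* the macro name and options are assumptions, not the exact commands run.
local controls black hispanic otherrace college female South Northeast West ///
    fulltime parttime age income1000 urban totalhousehold children ///
    socialproof_recycling rec_responsible_self rec_responsible_others rec_pleasant

* OLS of paid enrollment on price, without and then with demographic controls
regress paidseparate price2
regress paidseparate price2 `controls'

* Cross tab of paid enrollment by price, with column percentages
tabulate paidseparate price2, column

* Probit for the selection variable (free program enrollment)
probit freeseparate `controls'

* Heckman model with the difficult option and a 100-iteration cap
heckman paidseparate price2 `controls', select(freeseparate = `controls') ///
    difficult iterate(100)
```

Because `controls` is a local macro, these lines need to be run together in one do-file (or one selection) for the macro to be defined when it is used.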
Any thoughts or ideas? Please let me know if you have any clarifying questions.