Greetings everyone,
I specify two models in my study with two different dependent variables; one of them is a 0-1 dummy (y1), and the other is a count variable (y2). For each model, the independent variable of interest is a count variable (w), which is potentially endogenous. Thus, in my situation, I encounter two cases: 1) a logit model with a count endogenous explanatory variable and 2) a negative binomial model with a count endogenous explanatory variable.
To address this possible endogeneity, I first considered a 2SLS-style approach in which the endogenous count variable is replaced with its fitted values from a negative binomial first-stage regression. However, I read on Statalist that simply mimicking the standard 2SLS approach in non-linear models may not be an appropriate way to correct for endogeneity. I therefore decided to use the control function approach, i.e. two-stage residual inclusion (2SRI), proposed by Terza et al. (2008, "Two-stage residual inclusion estimation: Addressing endogeneity in health econometric modeling") and adapted by Wooldridge (2014, "Quasi-maximum likelihood estimation and testing for nonlinear models with endogenous explanatory variables").
Specifically, I implement this in Stata as follows:
1) In the first stage of 2SRI, a negative binomial regression is used in which the count endogenous variable (w) is regressed on two instruments (z1 and z2) and a set of controls (x1...xn):
nbreg w z1 z2 x1...xn, vce(cluster Firm)
2) Compute the generalized residuals (gr), as suggested by Wooldridge (2014):
predict gr, score
3) In the second stage of 2SRI, the generalized residuals, along with the count endogenous variable, are added to my two outcome models. Recall that y1 is a dummy and y2 is a count:
logit y1 w gr x1...xn, vce(cluster Firm)
nbreg y2 w gr x1...xn, vce(cluster Firm)
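For concreteness, here is how I would put the three steps together in one do-file. This is only a sketch of my current attempt, not a verified procedure. The control list x1 x2 x3 is a hypothetical stand-in for x1...xn, and sc is a stub name I made up for the score variables. I request both equation-level scores explicitly and keep the one for the mean equation, on the assumption that this is the generalized residual I need. The joint test of z1 and z2 is the relevance check I ask about in Q2 below.

```stata
* Hypothetical control list standing in for x1...xn
local controls x1 x2 x3

* Stage 1: negative binomial regression of the endogenous count w
* on the two instruments and the controls
nbreg w z1 z2 `controls', vce(cluster Firm)

* Joint Wald chi-square test of the two instruments
* (whether this is an adequate relevance check is part of Q2)
test z1 z2

* Equation-level scores from nbreg: sc1 is the score with respect to
* the mean-equation linear index, sc2 the score with respect to ln(alpha)
predict double sc*, scores

* Assumption: the mean-equation score is the generalized residual
rename sc1 gr
drop sc2

* Stage 2: include the generalized residual alongside w
logit y1 w gr `controls', vce(cluster Firm)
nbreg y2 w gr `controls', vce(cluster Firm)
```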
Given this setup, I have two questions:
Q1: Are the procedures and Stata commands described above correct?
Q2: How can I evaluate the relevance and exogeneity of my two instruments, z1 and z2? Can I use the partial chi-square test for the instruments in the first stage to test for relevance? Also, can I apply the standard overidentification test in the non-linear context by regressing the second-stage residuals on z1, z2, and the other controls (x1...xn) and multiplying the resulting R2 by 2 (the number of instruments) to get the test statistic?
I apologize for this long post.
Kindly help me answer my two questions. I am looking forward to your helpful insights.