This is a new topic for me, so pardon my basic understanding of the Heckman model. It's possible I should not be using a selection model at all, so I wanted to check first.
My dependent variable (individual-level) is the species diversity recorded by a birdwatcher in a district-time period. The independent variable is deforestation in the district-time period. Many individuals go looking for specific birds, so their data points are less useful for estimating the general impact of deforestation on species diversity. The goal is to identify the individuals who record everything in sight and focus on them for the analysis.
I plan to define a "veteran" birdwatcher as someone capturing a reasonably representative measure of diversity based on some predefined criteria. This could include the total trips they take, the number of months per year they go out, and whether they report all species during the trip. These predict veteran status but are not part of the outcome equation.
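To make the criteria concrete, here is a rough Python sketch of how the veteran dummy might be coded. The field names and both thresholds (`MIN_TRIPS`, `MIN_MONTHS`) are placeholders I made up for illustration; they are not values I have settled on.

```python
from dataclasses import dataclass


@dataclass
class Birdwatcher:
    # Hypothetical field names, one per criterion described above.
    total_trips: int
    months_active_per_year: int
    reports_all_species: bool


# Placeholder thresholds (assumptions, not calibrated to any data).
MIN_TRIPS = 20
MIN_MONTHS = 6


def is_veteran(b: Birdwatcher) -> int:
    """Return 1 if the birdwatcher meets every criterion, else 0."""
    meets_all = (
        b.total_trips >= MIN_TRIPS
        and b.months_active_per_year >= MIN_MONTHS
        and b.reports_all_species
    )
    return int(meets_all)


if __name__ == "__main__":
    sample = [
        Birdwatcher(50, 10, True),   # meets all three criteria
        Birdwatcher(50, 10, False),  # does not report all species
        Birdwatcher(5, 2, True),     # too few trips and months
    ]
    print([is_veteran(b) for b in sample])
```

Requiring all three criteria at once is only one possible rule; a score or a subset of the criteria could be used instead.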
Initially, I dropped all observations that didn't meet the selection criteria. My understanding is that doing this biases my coefficients by truncating the distribution of error terms. Can I treat my issue as a selection model?
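To check my understanding of the truncation concern, here is a small simulation sketch in Python. Everything in it is an assumption for illustration: the true slope of 2, the correlation `rho` between the selection error and the outcome error, and the selection rule itself. In particular, it assumes selection depends on an unobservable that is correlated with the outcome error, which may or may not hold for my data.

```python
import math
import random


def ols_slope(xs, ys):
    """Slope from a simple one-regressor OLS fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, rho=0.8, seed=1):
    """Return (slope on full sample, slope on selected subsample)."""
    rng = random.Random(seed)
    xs, ys, kept_x, kept_y = [], [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)  # stands in for deforestation
        u = rng.gauss(0, 1)  # unobserved part of selection
        # Outcome error correlated with u (correlation = rho).
        e = rho * u + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        y = 1.0 + 2.0 * x + e  # assumed true slope is 2
        xs.append(x)
        ys.append(y)
        if x + u > 0:  # assumed selection rule
            kept_x.append(x)
            kept_y.append(y)
    return ols_slope(xs, ys), ols_slope(kept_x, kept_y)


if __name__ == "__main__":
    full, kept = simulate()
    print(f"full sample slope: {full:.3f}")
    print(f"selected-only slope: {kept:.3f}")
```

Under these assumptions the slope estimated on the selected subsample should fall below the full-sample slope, which is the bias I am worried about. If `rho` is set to 0, so that selection is unrelated to the outcome error, the two slopes should be close.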
My idea is to generate a dummy=1 for veterans and 0 for non-veterans, based on the above criteria. Then I would estimate coefficients and standard errors with the heckman command. Is this the right approach? Is it a problem that I am "incidentally truncating" the data myself, and then correcting for it? Thanks.