Hi,
I am trying to estimate a two-step model to account for selection bias. My question is about which variables to include in each step. I am trying to replicate the approach in a paper. In the first step, the authors include a number of variables (e.g. gender, age, marital status, etc.), but in the regression they start with only gender as a control and sequentially add variables to see how family or labour characteristics affect the gender gap.
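In notation, the two steps described above can be sketched as follows. The symbols are labels chosen here for illustration and are not taken from the paper: $s_i$ is the selection indicator, $z_i$ the first-step (probit) variables, $x_i$ the controls in the regression, and $\lambda_i$ the inverse Mills ratio built from the probit index.

```latex
\begin{align*}
s_i &= \mathbf{1}\{ z_i'\gamma + u_i > 0 \}
    && \text{(step 1: probit on } z_i\text{)} \\
\hat\lambda_i &= \frac{\phi(z_i'\hat\gamma)}{\Phi(z_i'\hat\gamma)}
    && \text{(inverse Mills ratio)} \\
y_i &= x_i'\beta + \delta\,\hat\lambda_i + \varepsilon_i
    \quad \text{for } s_i = 1
    && \text{(step 2: OLS on the selected sample)}
\end{align*}
```

In these terms, the question is whether $z_i$ can stay fixed while $x_i$ grows from gender alone to the full set of controls.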
My question is whether I should keep all the variables in the probit part and sequentially add only the new control variables that I want in the regression part. Some of them appear in both steps, which is OK. I just want to check whether my approach is correct, or whether I should instead include only the common variables in both steps plus one extra variable in the probit, so that the estimation does not suffer from endogeneity. My code is below.
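capture program drop qr11a
program define qr11a, rclass
* Step 1: selection probit
probit prob1 gender Dage Dage1 Dchild childI /*
*/ Dmar DmarI DmarII marI marII marIII ageI ageII earningsD earningsI if ra0300<61
* Inverse Mills ratio from the probit linear index
predict double xb1, xb
g phi=normalden(xb1)
g PHI=normal(xb1)
g lambda=phi/PHI
* Step 2: outcome regression including lambda
reg nwealtht gender lambda Dage Dage1 Deduc Deduc1 Dchild nmar nmarI nmarII if partner==0 & ra0300<61
* Return the step-2 coefficients on gender and lambda (assumed to be the
* intended quantities; joining the probit and regression vectors would make
* columns 1 and 2 the probit coefficients on gender and Dage instead)
matrix bb = e(b)
scalar b_g=(bb[1,1])
return scalar b_g=bb[1,1]
scalar b_l=(bb[1,2])
return scalar b_l=bb[1,2]
* Drop the helper variables, including xb1, so the program can be called again
drop xb1 phi PHI lambda
end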
mi estimate, cmdok vceok: qr11a nwealtht gender lambda Dage Dage1 Deduc Deduc1 Dchild nmar nmarI nmarII if partner==0 & ra0300<61
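The sequential setup described above could be sketched roughly as follows, assuming the selection probit stays the same on every call while the regression controls are passed in by the caller. The program name `twostep_sketch`, the `select()` option, the variable name `imr_sketch`, and the returned scalars `b_first` and `b_imr` are placeholders invented for this illustration. The sketch has not been checked under `mi estimate`.

```stata
capture program drop twostep_sketch
program define twostep_sketch, rclass
    // varlist: outcome first, then the step-2 controls (hypothetical interface)
    syntax varlist(min=2) [if], SELect(string)
    gettoken depvar controls : varlist
    gettoken first : controls

    // Step 1: selection probit; select() holds the full probit specification
    probit `select'

    // Inverse Mills ratio from the probit linear index
    tempvar xb
    predict double `xb', xb
    capture drop imr_sketch
    generate double imr_sketch = normalden(`xb') / normal(`xb')

    // Step 2: outcome regression with whatever controls the caller passed
    regress `depvar' `controls' imr_sketch `if'

    // Coefficients on the first control and on the inverse Mills ratio
    return scalar b_first = _b[`first']
    return scalar b_imr   = _b[imr_sketch]
end
```

With the probit specification held in `select()`, a first call such as `twostep_sketch nwealtht gender if partner==0 & ra0300<61, select(prob1 gender Dage ... if ra0300<61)` would use gender as the only control, and later calls would add `Dage Dage1` and so on after `gender`. The "..." stands for the remaining probit variables from the code above. Note that the standard errors from a manual second-step `regress` do not account for lambda being estimated; the built-in `heckman` command with the `twostep` option does.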
Thanks,
Ilias