I want to run a simple regression in which one variable is observed only for females.
I have four independent variables: age, weight, height, and menstrual_cycle. The dependent variable is the patient's health (measured by the number of clinic visits).
Approach 1:
Because menstrual_cycle is available only for females, I drop the observations for male respondents.
keep if (female == 1)
reg health age weight height menstrual_cycle
Approach 2:
However, my friend said that I should use as many observations as possible. He recommends setting menstrual_cycle = 0 for males and including a dummy variable gender in the regression.
The regression becomes:
reg health age weight height menstrual_cycle gender
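For reference, here is a minimal sketch of how the data for Approach 2 might be prepared. It starts from the full sample, so the keep command from Approach 1 must not have been run. It assumes that female is coded 0/1, that menstrual_cycle is missing (.) for males, and that gender is simply a copy of female; these are assumptions for illustration, not details of the actual dataset.

```stata
* Start from the full sample (males and females both present)
* Assumption: female is 0/1 and menstrual_cycle is missing (.) for males
gen gender = female

* Fill in a placeholder value for males so they are not dropped by reg
replace menstrual_cycle = 0 if female == 0

reg health age weight height menstrual_cycle gender
```

In this setup the 0 is only a placeholder: filling in any other constant for males would shift the coefficient on gender and leave the coefficient on menstrual_cycle unchanged. The coefficients on age, weight, and height, however, are constrained to be the same for males and females.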
Does sample selection bias exist if I run the regression using Approach 1? Thank you.