I am working with panel data for two waves at the moment. The dependent variable (outcome) is a binary variable (0 / 1). There are multiple observations of the outcome variable for every ID and wave.
I have imputed the missing values for the outcome variable by using:
mi impute logit outcome (predictor variables), augment force add(20)
ID | wave | outcome
1 | 1 | .
1 | 1 | 1
1 | 1 | .
1 | 2 | .
1 | 2 | 1
1 | 2 | 0
Up to this point, everything worked well. However, for my research question, I am interested in whether at least one observation of the outcome variable equals 1 within each ID and wave. In other words, I want to generate a new variable from the imputed variable that indicates whether at least one observation per ID and wave has a value of 1. I therefore generated two new variables for each imputation: total_`num'_outcome sums the values of the imputed outcome variable by ID and wave, and _`num'_outcome_g1 is a binary variable (0 / 1) that indicates whether total_`num'_outcome is 0 or at least 1. The latter thus identifies whether at least one observation of the outcome variable within each ID and wave is 1.
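foreach num of numlist 1/20 {
    * number of 1s in imputation `num' within each ID-wave cell
    bysort ID wave: egen total_`num'_outcome = total(_`num'_outcome)
    * 1 if at least one observation in the cell equals 1, otherwise 0
    by ID wave: gen _`num'_outcome_g1 = 1 if total_`num'_outcome>0 & total_`num'_outcome!=.
    replace _`num'_outcome_g1=0 if total_`num'_outcome==0
}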
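The same indicator could probably be built in one step per imputation, since for a 0 / 1 variable the maximum within an ID-wave cell is 1 exactly when at least one observation in that cell is 1. The following is only a minimal sketch, assuming the imputed copies are stored as _1_outcome to _20_outcome as in the loop above; the variable name any_`num'_outcome is made up for this illustration.

```stata
* Sketch only: assumes the imputed copies _1_outcome ... _20_outcome exist.
* any_#_outcome is a hypothetical name, chosen so that it does not clash
* with the _#_outcome_g1 variables created in the loop above.
forvalues num = 1/20 {
    * For a 0/1 variable, the maximum within an ID-wave cell is 1
    * exactly when at least one observation in that cell equals 1.
    bysort ID wave: egen byte any_`num'_outcome = max(_`num'_outcome)
}
```

As long as no ID-wave cell has all of its imputed values missing, any_`num'_outcome should match _`num'_outcome_g1 in every imputation, which could be checked with assert.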
This results in:
ID | wave | outcome | _mi_miss | _`num'_outcome | total_`num'_outcome | _`num'_outcome_g1 | and so on
1 | 1 | . | 1 | 1 | 2 | 1 |
1 | 1 | 1 | 0 | 1 | 2 | 1 |
1 | 1 | . | 1 | 0 | 2 | 1 |
1 | 2 | . | 1 | 1 | 2 | 1 |
1 | 2 | 1 | 0 | 1 | 2 | 1 |
1 | 2 | 0 | 0 | 0 | 2 | 1 |
For the next step, I want to keep only one observation per ID and wave. To estimate my models, I want Stata to use the aggregated variables _`num'_outcome_g1 instead of the imputed values in _`num'_outcome.
My questions:
(1) Is such an approach possible in Stata?
(2) How do I run the estimation command on the _`num'_outcome_g1 variables instead of the "original" imputed data in _`num'_outcome?
Best regards
Fabian