Why would I have a different number of cases between models with MI'd data (possibly because of collinear dependencies of one variable)?
Hi Statalist,
I am running three logistic models with the same DV in Stata 12.
Model 1 has all variables but the control variables.
Model 2 has all variables.
Model 3 has only the significant variables (only 3).
The regression results say Models 1 and 2 have 145 observations, but Model 3 has 149 observations (all of my cases). I wish to know why.
I did multiple imputation on all variables with missing data, so I do not think listwise deletion due to missing data should be the issue.
One thing I notice that differs between (a) Models 1 and 2 and (b) Model 3 is that Model 3 does not include one binary variable, independence. In the Model 1 and 2 output, this variable has a logistic coefficient of 0 and an odds ratio of 1, with the standard error and the rest of its row omitted. I understand this to indicate that the variable is collinear with another and so is dropped, but I wish to know why this would reduce my number of cases and whether there is anything I can do about it.
Or maybe I have a different number of cases between the models for another reason?
I know that it is best practice to have the same number of observations in all models, so it would be great to get advice on how to resolve this.
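One check I could run is sketched below. It rests on an assumption I have not confirmed: that the four cases are lost because independence perfectly predicts passportdenied in some cell, rather than because of collinearity. The variable names come from my models; the choice of imputation 1 in the second command is arbitrary.

```stata
* Sketch only, under the assumption stated above.
* Cross-tabulate the DV against the omitted variable in the original data (m = 0).
* An empty cell would point to perfect prediction rather than collinearity.
mi xeq 0: tabulate passportdenied independence, missing

* Fit model 1 on a single imputed dataset so that logistic prints its own notes
* about any dropped variable and any observations it does not use.
mi xeq 1: logistic passportdenied ethnicmin foreign intervention democracy social religion independence

* Alternatively, display the estimation output from every imputation
* (verbose with 40 imputations).
mi estimate, or noisily: logistic passportdenied ethnicmin foreign intervention democracy social religion independence
```

If the notes report unused observations, that would account for the drop from 149 to 145 in Models 1 and 2.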
I provide my output below:
model 1
. mi estimate, or: logistic passportdenied ethnicmin foreign intervention democracy social religion independence
Multiple-imputation estimates                   Imputations       =         40
Logistic regression                             Number of obs     =        145
                                                Average RVI       =     0.0000
                                                Largest FMI       =     0.0000
DF adjustment:   Large sample                   DF:     min       =   1.68e+67
                                                        avg       =   1.68e+67
                                                        max       =          .
Model F test:       Equal FMI                   F(   6, 1.2e+69)  =       1.15
Within VCE type:          OIM                   Prob > F          =     0.3295
--------------------------------------------------------------------------------
passportdenied | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
     ethnicmin |   2.262649   2.020946     0.91   0.361     .3929555    13.02839
       foreign |   .5196286    .353581    -0.96   0.336     .1369284    1.971935
  intervention |   4.023013   2.457557     2.28   0.023     1.214994    13.32076
     democracy |   1.264409   1.129343     0.26   0.793     .2195902    7.280522
        social |   .7929084   .5783657    -0.32   0.750     .1898178    3.312143
      religion |   1.325323    1.68382     0.22   0.825      .109868    15.98718
  independence |          1  (omitted)
         _cons |   .0730259   .0703768    -2.72   0.007     .0110447    .4828366
--------------------------------------------------------------------------------
. *model 2 full model: including control variables + variables of interest
. mi estimate, or: logistic passportdenied bardate numdetained ageatbar male ethnicmin educyear foreign democracy social religion independence
Multiple-imputation estimates                   Imputations       =         40
Logistic regression                             Number of obs     =        145
                                                Average RVI       =     0.0447
                                                Largest FMI       =     0.1887
DF adjustment:   Large sample                   DF:     min       =    1111.90
                                                        avg       =  301944.31
                                                        max       = 1676960.32
Model F test:       Equal FMI                   F(  10,183109.0)  =       0.72
Within VCE type:          OIM                   Prob > F          =     0.7038
--------------------------------------------------------------------------------
passportdenied | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
       bardate |   .9810666    .024959    -0.75   0.452     .9333471    1.031226
   numdetained |   1.253062   .4282628     0.66   0.509     .6412875    2.448456
      ageatbar |   .9462083   .0273604    -1.91   0.056      .894019    1.001444
          male |   .9228291   .7386526    -0.10   0.920     .1922208    4.430391
     ethnicmin |   2.227631   2.148039     0.83   0.406     .3365465    14.74489
      educyear |   1.046962   .0832463     0.58   0.564     .8957871    1.223648
       foreign |   .9347841   .5921365    -0.11   0.915     .2700964    3.235219
     democracy |   2.196149   2.257036     0.77   0.444     .2929911     16.4615
        social |   1.096905   .8634727     0.12   0.906     .2344802    5.131356
      religion |    2.74065   3.729275     0.74   0.459     .1903649    39.45665
  independence |          1  (omitted)
         _cons |   6.38e+15   3.23e+17     0.72   0.473     4.24e-28    9.59e+58
--------------------------------------------------------------------------------
. *model 3 simplified, best-fit final model: including only intervention and age at bar
. mi estimate, or: logistic passportdenied intervention ageatbar
Multiple-imputation estimates                   Imputations       =         40
Logistic regression                             Number of obs     =        149
                                                Average RVI       =     0.0651
                                                Largest FMI       =     0.1566
DF adjustment:   Large sample                   DF:     min       =    1612.71
                                                        avg       =  421957.67
                                                        max       = 1261997.47
Model F test:       Equal FMI                   F(   2, 9397.0)   =       3.87
Within VCE type:          OIM                   Prob > F          =     0.0209
--------------------------------------------------------------------------------
passportdenied | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
  intervention |   3.246078   1.791845     2.13   0.033     1.100253     9.57691
      ageatbar |   .9524412   .0250589    -1.85   0.064     .9045364    1.002883
         _cons |   .4047996   .3786314    -0.97   0.334     .0646604    2.534206
--------------------------------------------------------------------------------