I'd like to know why, when I use a multilevel model, the in-sample predictions differ from the out-of-sample predictions, even though I use the same dataset.
When I follow the same procedure with logit I have no problem; in other words, the in-sample and out-of-sample predictions are identical.
Thanks
Code:
. melogit turismo idade filhos || pais:, nolog
Mixed-effects logistic regression               Number of obs     =      1,622
Group variable:            pais                 Number of groups  =         50

                                                Obs per group:
                                                              min =          2
                                                              avg =       32.4
                                                              max =        118

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(2)      =      52.18
Log likelihood = -1038.1176                     Prob > chi2       =     0.0000
------------------------------------------------------------------------------
     turismo |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       idade |   .0150543   .0066673     2.26   0.024     .0019866    .0281221
      filhos |  -.4239421   .0598524    -7.08   0.000    -.5412506   -.3066335
       _cons |   .4393717   .2954911     1.49   0.137    -.1397803    1.018524
-------------+----------------------------------------------------------------
pais         |
   var(_cons)|   .2551942   .0880849                      .1297368    .5019708
------------------------------------------------------------------------------
LR test vs. logistic model: chibar2(01) = 52.82       Prob >= chibar2 = 0.0000
. predict phat
(predictions based on fixed effects and posterior means of random effects)
(option mu assumed)
(using 7 quadrature points)
. preserve
. replace turismo = .
(1,622 real changes made, 1,622 to missing)
. predict phat2
(predictions based on fixed effects and posterior means of random effects)
(option mu assumed)
(using 7 quadrature points)
. list phat phat2 if pais=="Brasil"
+---------------------+
| phat phat2 |
|---------------------|
1198. | .6316937 .6069548 |
1199. | .491252 .4650681 |
1200. | .7533196 .7333011 |
1201. | .747682 .7273715 |
1202. | .4950149 .4688152 |
|---------------------|
1203. | .491252 .4650681 |
1204. | .4874901 .4613249 |
1205. | .717749 .6960087 |
1206. | .6659743 .6422338 |
1207. | .6068546 .5815524 |
|---------------------|
1208. | .6068546 .5815524 |
1209. | .6032571 .5778845 |
1210. | .6175761 .5925006 |
1211. | .6495774 .6253273 |
1212. | .6731711 .6496731 |
|---------------------|
1213. | .7207888 .6991845 |
1214. | .6862789 .6632519 |
+---------------------+
. restore
. logit turismo idade filhos, nolog
Logistic regression                             Number of obs     =      1,622
                                                LR chi2(2)        =      49.40
                                                Prob > chi2       =     0.0000
Log likelihood = -1064.5279                     Pseudo R2         =     0.0227

------------------------------------------------------------------------------
     turismo |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       idade |   .0150046   .0063677     2.36   0.018     .0025241    .0274851
      filhos |  -.3869703    .056669    -6.83   0.000    -.4980395   -.2759011
       _cons |   .2924819   .2717957     1.08   0.282    -.2402278    .8251917
------------------------------------------------------------------------------
. predict phat3
(option pr assumed; Pr(turismo))
. replace turismo = .
(1,622 real changes made, 1,622 to missing)
. predict phat4
(option pr assumed; Pr(turismo))
. list phat3 phat4 if pais=="Brasil"
+---------------------+
| phat3 phat4 |
|---------------------|
1198. | .5887493 .5887493 |
1199. | .4555629 .4555629 |
1200. | .7032152 .7032152 |
1201. | .6969143 .6969143 |
1202. | .4592868 .4592868 |
|---------------------|
1203. | .4555629 .4555629 |
1204. | .4518439 .4518439 |
1205. | .6716723 .6716723 |
1206. | .6245352 .6245352 |
1207. | .5631031 .5631031 |
|---------------------|
1208. | .5631031 .5631031 |
1209. | .5594082 .5594082 |
1210. | .574144 .574144 |
1211. | .5988468 .5988468 |
1212. | .6237966 .6237966 |
|---------------------|
1213. | .6749727 .6749727 |
1214. | .6377733 .6377733 |
+---------------------+
.