Hi,

I am running a multilevel logistic regression on DHS data from Ukraine, and I am not sure about the results. I need an expert opinion from someone who can confirm that I did it right. I want to investigate the factors associated with tobacco smoking. For the random intercept I used region, which is a categorical variable (North, South, East, West). My level-1 variables are age, gender, work status (working or not), highest education level, and marital status. I first ran the random-intercept model without any level-1 variables, using:

xtmelogit smoke || region:


  Random-effects Parameters |   Estimate  Std. Err.      [95% Conf. Interval]
----------------------------+------------------------------------------------
region: Identity            |
                 var(_cons) |   .0660989   .0433564      .0182752      .23907

LR test vs. logistic model: chibar2(01) = 122.11   Prob >= chibar2 = 0.0000
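
As a rough check on this null model, the intraclass correlation implied by the region variance can be computed by hand. This is only a sketch: it assumes the latent-variable formulation of the logistic model, in which the level-1 residual variance is fixed at pi^2/3. That assumption and the scalar names are mine and are not part of the output above.

```stata
* Region-level variance from the null model output above
scalar v_null = .0660989

* Latent-variable ICC: between-region variance over total variance,
* with the level-1 variance assumed to be pi^2/3 (about 3.29)
scalar icc_null = v_null / (v_null + _pi^2/3)

* Displays approximately 0.0197, i.e. about 2% of the latent variance
display "ICC = " %6.4f icc_null
```

Under that assumption, only about 2 percent of the variation in the latent propensity to smoke lies between regions, even though the LR test rejects the single-level model.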


Then I computed a patient risk score by running a logistic regression on all the predictor variables and saving the linear prediction as smokerisk. I then used smokerisk as the only fixed effect in the multilevel model.

. xi: logistic smoke age i.ms i.highedu i.wealthi i.cwork i.gender
i.ms              _Ims_0-5            (naturally coded; _Ims_0 omitted)
i.highedu         _Ihighedu_0-3       (naturally coded; _Ihighedu_0 omitted)
i.wealthi         _Iwealthi_1-5       (naturally coded; _Iwealthi_1 omitted)
i.cwork           _Icwork_0-1         (naturally coded; _Icwork_0 omitted)
i.gender          _Igender_0-1        (naturally coded; _Igender_0 omitted)

Logistic regression                             Number of obs     =     12,210
                                                LR chi2(15)       =    2764.91
                                                Prob > chi2       =     0.0000
Log likelihood = -4906.05                       Pseudo R2         =     0.2198


       smoke | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.004373   .0034311     1.28   0.201      .997671    1.011121
      _Ims_1 |   1.314048   .1226059     2.93   0.003     1.094437    1.577727
      _Ims_2 |    3.14551   .4351557     8.28   0.000      2.39847    4.125227
      _Ims_3 |   1.641184    .289318     2.81   0.005     1.161722    2.318528
      _Ims_4 |   3.149873   .3790349     9.53   0.000     2.488084    3.987686
      _Ims_5 |   3.116891   .5685119     6.23   0.000     2.180043     4.45634
 _Ihighedu_1 |    3.54832   3.780326     1.19   0.235     .4397094    28.63386
 _Ihighedu_2 |   .4787269   .4745827    -0.74   0.457     .0685891    3.341341
 _Ihighedu_3 |   .3140836   .3114419    -1.17   0.243     .0449783    2.193247
 _Iwealthi_2 |   .8653848    .070379    -1.78   0.075     .7378766    1.014927
 _Iwealthi_3 |   1.242929   .1050261     2.57   0.010     1.053224    1.466804
 _Iwealthi_4 |   1.108127   .1003171     1.13   0.257     .9279647    1.323267
 _Iwealthi_5 |   1.047131    .093587     0.52   0.606     .8788707    1.247604
   _Icwork_1 |   1.589846   .1090885     6.76   0.000      1.38979    1.818699
  _Igender_1 |   12.57861   .7383958    43.13   0.000     11.21153    14.11238
       _cons |   .1005307   .1002092    -2.30   0.021     .0142501    .7092155

Note: _cons estimates baseline odds.

. predict smokerisk, xb

. xtmelogit smoke smokerisk || region:, var

Refining starting values:

Iteration 0: log likelihood = -4876.2025 (not concave)
Iteration 1: log likelihood = -4871.9337
Iteration 2: log likelihood = -4871.5384

Performing gradient-based optimization:

Iteration 0: log likelihood = -4871.5384
Iteration 1: log likelihood = -4871.2656
Iteration 2: log likelihood = -4871.2629
Iteration 3: log likelihood = -4871.2629

Mixed-effects logistic regression               Number of obs      =    12,210
Group variable: region                          Number of groups   =         5

                                                Obs per group:
                                                               min =     1,889
                                                               avg =   2,442.0
                                                               max =     3,145

Integration points =   7                        Wald chi2(1)       =   2230.59
Log likelihood = -4871.2629                     Prob > chi2        =    0.0000


       smoke |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   smokerisk |   .9993888   .0211604    47.23   0.000     .9579151    1.040862
       _cons |  -.0028393   .1065424    -0.03   0.979    -.2116585    .2059799



  Random-effects Parameters |   Estimate  Std. Err.      [95% Conf. Interval]
----------------------------+------------------------------------------------
region: Identity            |
                 var(_cons) |   .0509872   .0341417      .0137241    .1894251

LR test vs. logistic model: chibar2(01) = 69.57   Prob >= chibar2 = 0.0000

So the region-level variance dropped from .0660989 to .0509872.
That is a proportional reduction of (.0660989 - .0509872)/.0660989 = .22862256, which I read as almost 23 percent of the regional variance being explained by personal characteristics.
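
For reference, the calculation behind that figure can be reproduced directly in Stata. This is a minimal sketch using the two variance estimates reported in the output; the scalar names are hypothetical and chosen only for illustration.

```stata
* Region-level variance from the null model and from the model with smokerisk
scalar v_null  = .0660989
scalar v_model = .0509872

* Proportional change in variance: share of the null-model region variance
* that disappears once the risk score is added
scalar pcv = (v_null - v_model) / v_null

* Displays approximately 0.2286
display "PCV = " %6.4f pcv
```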
When I try to run this model with the cluster number, the results get messed up: the variance increases instead of decreasing. I am confused; kindly advise.
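
To make the cluster question concrete, here is a sketch of one way the cluster level could be specified, with clusters nested within regions. It is an assumption about the setup rather than the exact command that produced the increase, and `cluster` is a hypothetical name for the DHS cluster identifier in the dataset.

```stata
* Hypothetical variable name: cluster (the DHS cluster identifier)

* Null model with clusters nested within regions
xtmelogit smoke || region: || cluster:, var

* Same random part, with the risk score added as the only fixed effect
xtmelogit smoke smokerisk || region: || cluster:, var
```

Fitting both models with the same random part means the variance at each level is compared like for like, which may not hold if the cluster model is compared against the region-only null model.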