I did my modelling using “svy:melogit” with the ‘subpop’ option in Stata 15.1 as follows after reading Hox [1], LEMMA, Sommet [2], Rabe-Hesketh [3], Carle [4], Enders [5], Kreft [6], Paccagnella [7] and many more [8].

Step 1: The null model
Step 2: Fixed effects at individual level entered as group mean centered versions (GMC) and their corresponding group means (GM): matgr (maternal age group), v190 (household wealth), v106 (maternal education), and sex (child’s sex) with their respective GM- gm1, gm2, gm4, and gm6
Step 3: Other group/cluster level contextual effects (prevalence of child marriage in cluster) and global variables (urban/rural and countries (4; varname- contr))
Step 4: Intra-level interactions added one at a time (all possible combinations), followed by adding all significant ones together
Step 5: Individual level variables introduced as random effects, one at a time, in the form of grand mean centered variable as well as a dummy variable for each category as suggested in LEMMA.
Step 6: Cross-level interactions (all possible combinations) added one at a time, followed by adding all significant ones together

At each step, the “final-step” model included only significant predictors which was then checked for adequate approximation at multiple quadrature points (8, 10, 15, and 20). In case of step 2, both versions of a variable (GMC and the GM) were kept in the model if either one was significant- the same goes for interactions. The “final-step” model was checked against the one from the previous step by likelihood ratio tests.

I tried some variations to the above by using only GMC in Step 2 and reintroducing the GMs in Step 3; using uncentered individual level variables in Step 2 with and without grand means (the latter is a bad idea as per Bell & Jones [9]); adding the global variables in Step 2; adding rural as a random effect.

Some of my Stata commands and the final model are as below:

Code:
*Rescaling of weights
*Level 1 weights using scaling method 1 (Carle, 2009): New weights sum to cluster/state sample size
            gen sqw = wt*wt
            egen sumsqw1 = sum(sqw), by(state)
            egen sumw1 = sum(wt), by(state)
            gen pwt11 = wt*sumw1/sumsqw1
             
* Survey setting
gen wt2=1 svyset state, weight(wt2) strata(v025) , singleunit(centered) || _n, weight(pwt11)
*Null model
svy, subpop(basep):melogit lbw||state:,
             mat a=e(b)

*Final model
svy, subpop(basep):melogit lbw i.contr##c.matgr_g i.contr#c.v106_g i.contr#c.sex_g i.contr#c.gm1 c.matgr_g#c.gm2 c.v190_g##c.v106_g c.v190_g#c.pc_ch_marriage c.sex_g c.gm1 c.gm2 c.gm4 c.gm6 c.pc_ch_marriage||state: , from(a, skip)
The total sample size is 1561370 while the base population is 81692 with 56 groups. My final results are displayed as predictions using ‘marginsplot’.

Below are two of the graphs that say that probability of low birth weight increases with increasing average household wealth in cluster (gm2)- which does not make sense and is incorrect [10].

Code:
*Graph 1
est restore s32 margins , subpop(basep) pred(pr) at(gm2=(1.715066 1.829877 2.03465 2.382972 2.588799 2.756229 3.115728 3.453595 3.989769 4.308882 4.619933)) vsquish post nose
 *Graph 2
est restore s32 margins , subpop(basep) pred(pr) at(gm2=(1.715066 1.829877 2.03465 2.382972 2.588799 2.756229 3.115728 3.453595 3.989769 4.308882 4.619933)) over(matgr) vsquish post nose
Array Array

So, where have I gone wrong? I will appreciate any advice on this.

Thank you
Deepali

References
  1. Hox J. J., 1995. Applied Multilevel Analysis. Amsterdam: TT-Publikaties
  2. Sommet N. & Morselli D., 2017. Keep Calm and Learn Multilevel Logistic Modeling: A Simplified Three-Step Procedure Using Stata, R, Mplus, and SPSS. International Review of Social Psychology, 30(1), 203–218, DOI: https://doi.org/10.5334/irsp.90
  3. Rabe-Hesketh S. & Skrondal A., 2006. Multilevel modelling of complex survey data. J. R. Statist. Soc. A, 169, Part 4, pp. 805–827
  4. Carle A. C., 2009. Fitting multilevel models in complex survey data with design weights: Recommendations. BMC Medical Research Methodology, 9:49 doi:10.1186/1471-2288-9-49.
  5. Enders C & Tofighi D., 2007. Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods, 12(2):121.
  6. Kreft I, De Leeuw J, & Aiken L., 1995. The effect of different forms of centering in hierarchical linear models. Multivariate Behavioral Research, 30(1):1-21.
  7. Paccagnella O., 2006. Centering or not centering in multilevel models? The role of the group mean and the assessment of group effects. Evaluation Review, Vol. 30 No. 1, 66-85
  8. Bell A. & Jones K., 2014. Explaining fixed effects: Random effects modeling of time-series cross-sectional and panel data. Political Science Research and Methods, doi:10.1017/psrm.2014.7
  9. Bell A. & Jones K., 2018. Understanding and misunderstanding group mean centering: a commentary on Kelley et al.’s dangerous practice. Quality & Quantity, Volume 52, Issue 5, pp 2031–2036
  10. Cerda M., L Buka S., & Rich-Edwards J. W., 2008. Neighborhood influences on the association between maternal age and birth weight: A multilevel investigation of age-related disparities in health. Soc Sci Med., 66(9): 2048–2060. doi:10.1016/j.socscimed.2008.01.027