About the Dataset

I am working with Demographic and Health Survey (DHS) data. DHS uses a two-stage cluster sampling design. In the first stage, clusters (primary sampling units, PSUs) are randomly selected with probability proportional to their size. In the second stage, 20-30 households are randomly selected from each cluster. Most DHS surveys have more than 300 clusters. However, large countries such as India (which is included in my analysis) have far more clusters (25,000). I have pooled 3-4 waves each from 25 countries, giving 95 survey samples in total. No PSU information is missing.

About the Model

My dependent variable is neghaz (the negative of height-for-age (cm/months)), which is continuous. My regression specification includes several control variables, including squared terms and interaction terms. It also includes variables calculated at the PSU/cluster level (mean employment rate in the cluster, etc.) and variables at the country level (GDP, average life expectancy, etc.). I have already de-normalized the weights.

Issue

I am trying to estimate the following three-level hierarchical model (respondents within clusters within surveys):

mixed neghaz $controlset [pw=weight] || psu: || survey:

Here survey identifies each of the 95 samples in the data.

The model failed to converge. I then tried a null model, which also failed to converge. I do not understand why the null model fails to converge when there are 95 surveys and every survey has at least 300 clusters.
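To make clear what I mean by the null model, here is a sketch of it written out as a three-level random-intercept model. The notation is my own and is only an assumption about how the Stata command is parameterised: i indexes children, j indexes PSUs, and k indexes surveys.

$$
\text{neghaz}_{ijk} = \beta_0 + v_k + u_{jk} + \varepsilon_{ijk}, \qquad
v_k \sim N(0, \sigma_v^2), \quad
u_{jk} \sim N(0, \sigma_u^2), \quad
\varepsilon_{ijk} \sim N(0, \sigma_e^2)
$$

Here v_k is the survey-level random intercept, u_{jk} is the random intercept of PSU j within survey k, and ε_{ijk} is the child-level residual.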

I also tried the null model after converting neghaz into a dichotomous variable, stunted, which takes the value 1 if the child is stunted (chronic malnutrition), and fitting it with xtmelogit. Convergence failed again.

Afterwards, I ran two separate two-level models, one with PSU as the grouping level and one with survey. Both models converged with the full control set. However, the standard errors differed between the two models.

ICC for the model with PSU: 0.98; ICC for the model with survey: 0.02
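For reference, I am assuming these are the usual intraclass correlations from a two-level linear random-intercept model, i.e. the share of the unexplained variance that sits at the grouping level. The symbols below are my own notation, with σ_u² the variance of the group-level (PSU or survey) random intercept and σ_e² the residual variance:

$$
\text{ICC} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
$$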

Questions

Can somebody help me understand why convergence is failing and how to fix it?

Can I safely neglect the survey random effects in this case?

Is there any other way of combining survey effects (random or fixed) with the PSU random effects?

I also tried models with only survey fixed effects (i.survey with ordinary OLS). However, the standard errors were different again. Which model should I finally use in such a case?

Sorry for the long post.