Hi,

I am fitting a simple variance-components multilevel model to a large longitudinal dataset (more than 60,000 observations). The model has three levels: observations nested within individuals nested within communities. The dependent variable 'phi' is binary, with 95% 0s and 5% 1s.

I tried both Stata and R (the 'lme4' package).

The Stata code is:

melogit phi || commid: || idind:

The R code is:

library(lme4)
fit <- glmer(phi ~ (1 | commid/idind), data = dt, family = binomial("logit"))


The problem is that Stata estimates this model quickly, while R gives a convergence warning: "Model failed to converge with max|grad| = 0.0377982 (tol = 0.001, component 1)".

I suspect the cause is the highly unbalanced dependent variable, which R seems to struggle with: when I tried other, more balanced binary dependent variables, R fit the models without this warning.
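Since I cannot share the data, here is a minimal sketch of how the imbalance might be reproduced with simulated data. Everything in it is an assumption made for illustration: the helper name simulate_phi, the group sizes, the intercept and the random-effect standard deviations are hypothetical and are not taken from my dataset. The intercept is set low so that 1s are rare. The last part refits the same model with a different optimizer, which is one way to check whether the warning depends on the optimizer.

```r
# Hypothetical helper (not part of lme4): simulate a binary outcome with
# observations nested in individuals nested in communities.
# All default values are illustrative assumptions.
simulate_phi <- function(n_comm = 50, n_ind = 10, n_obs = 5,
                         intercept = -3.5, sd_comm = 0.5, sd_ind = 1,
                         seed = 1) {
  set.seed(seed)
  n_id   <- n_comm * n_ind
  commid <- rep(seq_len(n_comm), each = n_ind * n_obs)
  idind  <- rep(seq_len(n_id), each = n_obs)
  u_comm <- rnorm(n_comm, sd = sd_comm)  # community random intercepts
  u_ind  <- rnorm(n_id, sd = sd_ind)     # individual random intercepts
  eta    <- intercept + u_comm[commid] + u_ind[idind]
  phi    <- rbinom(length(eta), size = 1, prob = plogis(eta))
  data.frame(phi = phi, commid = factor(commid), idind = factor(idind))
}

sim <- simulate_phi()
print(mean(sim$phi))  # share of 1s in the simulated outcome

if (requireNamespace("lme4", quietly = TRUE)) {
  # Same model as in the question, fitted to the simulated data
  fit_default <- lme4::glmer(phi ~ (1 | commid/idind), data = sim,
                             family = binomial("logit"))
  # Same model, using bobyqa for both optimization stages
  # (the glmer default switches to Nelder-Mead in the second stage)
  fit_bobyqa <- lme4::glmer(phi ~ (1 | commid/idind), data = sim,
                            family = binomial("logit"),
                            control = lme4::glmerControl(optimizer = "bobyqa"))
  print(lme4::VarCorr(fit_default))
  print(lme4::VarCorr(fit_bobyqa))
}
```

Raising the intercept argument (for example to 0) gives a more balanced outcome, so the two settings can be compared on otherwise identical data. If both optimizers return similar variance estimates, that would suggest the warning is not changing the results much, though I have not verified this on my real data.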


Does anyone know why Stata and R perform so differently in this case?