Hello everyone,

I am building a statistical model that predicts basic income preferences among voters using data from Wave 8 of the European Social Survey (2016). To account for country-level variation, I am using a multilevel logistic regression model in which individual respondents (level 1 units) are nested within countries (level 2 units). I have a fair amount of experience with regression analysis in Stata, but I am effectively a novice at multilevel modeling and lack a conceptual background in the technique.

The dependent variable is a binary indicator of whether a respondent supports or opposes basic income. I have three sets of independent variables that I am looking to test in a series of models: individual demographic variables (such as age and income), individual attitudinal variables (various preferences about redistribution), and country-level contextual variables (such as the level of means-tested welfare spending in the respondent's country). The first two sets of variables are measured at level 1, while the third set is measured at level 2. To be clear, I am looking to (1) isolate the effects of individual-level predictors while controlling for country-level heterogeneity and (2) estimate the effect of country-level predictors on basic income preferences.

1) I've read a decent amount of statistical literature suggesting that multilevel models with a small level-2 sample size lead to biased standard errors for the level-2 estimates. There are 21 countries included in my analysis. Will this pose an issue for the reliability of my country-level estimates? Is there another model or regression structure that I should consider in order to circumvent this problem? Some papers have suggested that a fixed effects model would allow me to estimate the “moderating effect” of a country-level variable through cross-level interactions. However, I am interested in measuring the direct effect of each country-level variable, particularly because I have many level 1 and level 2 variables, and it would be too time-consuming to measure moderating effects between a country-level variable and every individual-level variable.

2) Is it necessary to specify a random slope in the regression model? In what situation would it be appropriate to do so? Currently, the syntax of my base regression looks like this:

Code:
 melogit basicincome agea age2 gndr hinctnta eduyrs uemp5yr mbtru_curr mbtru_prev RTIscore [pw=pweight] || cntry:
Here no variable is listed in the random-effects equation that follows the "||", so only the intercept is allowed to vary across countries.
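
For comparison, my understanding is that a random slope is requested by listing a variable after the colon. The sketch below uses hinctnta purely as an illustration (I have no substantive reason for picking it yet), and the covariance(unstructured) option is my assumption about how to let the random intercept and slope be correlated. I have not settled on this specification.

Code:
 * random intercept for cntry plus a random slope on hinctnta (illustrative choice)
 melogit basicincome agea age2 gndr hinctnta eduyrs uemp5yr mbtru_curr mbtru_prev RTIscore [pw=pweight] || cntry: hinctnta, covariance(unstructured)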

3) Is there any way to calculate or determine the explained variance of each of the models?
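
For context, the only related quantity I know how to obtain is the intraclass correlation from an empty model, which I understand to be the share of latent-scale variance that lies between countries, sigma_u^2 / (sigma_u^2 + pi^2/3). A minimal sketch follows. I have left out the weights here because I am not sure whether estat icc supports them, so please treat that omission as an assumption on my part.

Code:
 * empty (intercept-only) model with a random intercept for country
 melogit basicincome || cntry:
 * share of latent-scale variance attributable to countries
 estat icc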