Hi,

I have missing data at level 2 for predictor variables, and I can't find a way to do multiple imputation (bearing in mind that I am nowhere near an expert in either Stata or multilevel models).

My study is looking at children's engagement in classrooms, so I have 4 levels, with children being level 2, i.e. the structure of the dataset is school > class > child > observation. I have about 20-30 observations per child, and 5 children per classroom. I have 1669 observations in total. I'm mainly interested in the impact of activity on engagement (i.e. of a level 1 predictor variable).

I also have teacher questionnaires about the children's personality traits (i.e. level 2 predictors). However, the questionnaires are missing for one class, i.e. 5 children. This raises several issues: the data are missing for the same participants on every variable, there are 8 of these level 2 variables, and I have very little that could be used to predict their values. Age is also missing for those children, so the only other data I have are the children's gender and the engagement scores, and the latter are my outcome variable.
I have read that there are methods for doing this, but the author didn't elaborate, and I think they would probably be beyond me unless there is a neat Stata command to do it.

I tried to do multiple imputation but it gives me level 1 imputations (i.e. different personality scores depending on the observation), which is obviously not appropriate.
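
To make clear what I mean, here is a minimal sketch of the structure I think the imputation should have. It is written in Python rather than Stata, and the variable names (`child`, `engagement`, `trait`) and the toy data are made up for illustration. The imputation step itself is only a placeholder (a random draw from the observed child-level values), not a proper imputation model. The point is just that the data are collapsed to one row per child, imputed there, and merged back, so every observation of a child gets the same score.

```python
import random

# Hypothetical long-format data: one row per observation (level 1).
# 'trait' is a child-level (level 2) score; None marks a missing questionnaire.
observations = [
    {"child": 1, "engagement": 3, "trait": 4.0},
    {"child": 1, "engagement": 2, "trait": 4.0},
    {"child": 2, "engagement": 5, "trait": 2.5},
    {"child": 2, "engagement": 4, "trait": 2.5},
    {"child": 3, "engagement": 1, "trait": None},
    {"child": 3, "engagement": 2, "trait": None},
]


def impute_at_child_level(rows, seed):
    """Return a copy of rows with exactly one (possibly imputed) trait per child."""
    rng = random.Random(seed)

    # Step 1: collapse to one record per child (the trait is constant within child).
    child_trait = {}
    for row in rows:
        child_trait.setdefault(row["child"], row["trait"])

    # Step 2: impute on the child-level table.
    # Placeholder only: a random draw from the observed child-level values.
    observed = [value for value in child_trait.values() if value is not None]
    for child in list(child_trait):
        if child_trait[child] is None:
            child_trait[child] = rng.choice(observed)

    # Step 3: merge the child-level values back onto every observation.
    return [{**row, "trait": child_trait[row["child"]]} for row in rows]


# Five imputed datasets, one per seed, as multiple imputation would produce.
imputed_sets = [impute_at_child_level(observations, seed=m) for m in range(5)]
```

In each of the five datasets, child 3 has a single imputed score repeated across both of its observations, which is the behaviour I am not getting at the moment.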

At the moment, the only solution I have found is listwise deletion (though I'm aware of the issues). I would then compare the estimates for my level 1 variables between the models with and without that classroom (i.e. 1669 and 1570 observations respectively), and treat the results for the level 2 variables with caution (I'm presenting them separately and as exploratory).


Any thoughts welcome.

Thank you in advance (and Merry Christmas!)