Hi everyone,

I am new to multiple imputation. I have a large dataset (60,000+ observations) covering 7 years. One variable (called "missing_variable" in the example below) is missing for the first two years because of collection issues. I am trying to impute it using a dozen other variables in the data (including the variables that I am using in my analysis model).

Here is my code so far:


Code:
mi set wide 
mi xtset, clear 
mi register imputed missing_variable

mi impute pmm missing_variable control1 control2 control3.... control12 i.year , add(5) rseed(25000) noisily dots force knn(30) bootstrap

I've been using pmm because most of these variables are not normally distributed.

The results are a little strange. About 80% of the imputed values are equal to a single number, 1.115355, and this holds in all 5 imputations.
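
Since pmm fills each missing value with an observed value taken from a donor, 1.115355 must already occur in the observed data. A minimal check of how common it is there could look like the sketch below. It assumes the wide style stores the first imputation as _1_missing_variable, and the 1e-6 tolerance is an arbitrary choice to avoid exact floating-point comparison.

```stata
* Sketch only: assumes wide style, so imputation m is stored as _m_missing_variable.
* The 1e-6 tolerance is an arbitrary choice to sidestep float precision.

* Observed values: total count, then count equal to the suspicious number
count if !missing(missing_variable)
count if abs(missing_variable - 1.115355) < 1e-6

* Imputed values in the first imputation: total count, then count equal to it
count if missing(missing_variable)
count if missing(missing_variable) & abs(_1_missing_variable - 1.115355) < 1e-6

* Which years contribute the observed occurrences of this value?
tabulate year if abs(missing_variable - 1.115355) < 1e-6
```

Comparing the share in the observed data with the share among the imputed values would show whether the spike is already present in the donors or only appears after imputation.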

Does anyone have an idea of what might be going on here?

Jack