Hello,

I am attempting to estimate a model:

Code:
logit y x1 x2 x3
where x1 is a dummy variable, x2 and x3 are continuous variables with non-normal distributions, and all three of x1, x2, and x3 have missing values. To fill in the missing values, I am using multiple imputation. One possible specification would be:

Code:
mi set flong
mi register imputed x1 x2 x3
mi impute chained (logit) x1 (pmm, knn(10)) x2 x3 = y, add(5) burnin(10)
However, testing has shown that x2 and x3 have a non-linear relationship, so I would like the mi procedure to run a customized regression in the first step of predictive mean matching. For example, when imputing x2 at iteration 2, I would like x3 to be binned into a categorical variable b3, with ten bins covering different ranges of x3, and b3 to enter the model as a factor variable. The initial regression step when imputing x2 would then, in theory, be:

Code:
regress x2 x1 i.b3 y
instead of
Code:
regress x2 x1 x3 y
More generally, I would like every continuous variable in the model to remain continuous when it is the dependent variable, but to be binned into a specified number of automatically determined bins when it is an independent variable.
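
To make the intended regression concrete, here is a minimal sketch on simulated data. The data-generating process, the seed, and the use of xtile to form decile bins are my own assumptions for illustration; any other automatic binning rule would do. It only shows the model I have in mind for x2 on complete data. It does not re-create b3 from the current imputed values of x3 at each iteration of the chained procedure, which is the part I cannot figure out.

```stata
* Hypothetical illustration only: simulated data, not my real dataset
clear
set seed 12345
set obs 500

generate x3 = rchi2(3)                        // skewed continuous variable
generate x1 = runiform() < 0.4                // dummy variable
generate x2 = sqrt(x3) + 0.5*x1 + rnormal()   // non-linear in x3
generate y  = runiform() < invlogit(-1 + 0.3*x2)

* Bin x3 into ten quantile-based categories (assumed binning rule)
xtile b3 = x3, nquantiles(10)

* The regression I would like PMM to use when x2 is the imputed variable
regress x2 x1 i.b3 y
```

Because b3 is computed once here, it would go stale as soon as x3 is re-imputed, which is why I think the binning has to happen inside the imputation step itself.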

Any help or guidance is deeply appreciated! I've played around with Mata a few times before and think this may require defining a usermethod, but I'm hoping an existing approach already handles this type of request. If not, any guidance on where to find the relevant ado-files on my system to "borrow" code from would be appreciated (I'm running Stata 15.1 on a Windows server).