I have a data set that some colleagues and I need to do a survival analysis on. The outcome is time to recurrence of cancer. The main predictor of interest is a dichotomous variable indicating whether or not the patient received chemotherapy. Another variable of interest is a risk score that, in other contexts, is known to be a good predictor of the risk of recurrence. The model includes a few other variables, but for present purposes it is just
Code:
stcox i.chemotherapy##c.risk_score
and this produces satisfactory results. Many of the patients have not had the risk_score calculation done (it requires a lab test that was not in widespread use until fairly recently). So we have developed an imputation model based on other patient and tumor characteristics; when tried out on data where the real risk scores were known, its predictions agree rather well with the actual scores. And we are repeating the Cox regressions with multiple imputation. So far so good.
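In case it matters, the MI run itself is the obvious one (a sketch, assuming the data have already been -mi set-, risk_score registered as imputed, and the survival variables declared with -mi stset-):
Code:
mi estimate: stcox i.chemotherapy##c.risk_score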

A complication is that clinicians are prone to using this risk score by categorizing it into low, medium, and high based on certain cutoffs. These cutoffs do not have a strong scientific basis, but they have become entrenched in clinical practice. So our target audience would prefer to see our results using the categorized version.
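For concreteness, the categorization step is something like this (a sketch: the breaks 20 and 40 are placeholders, not the actual clinical cutoffs, and with the imputed data the same step runs under -mi passive-):
Code:
* placeholder breaks; the last one just needs to exceed the maximum score
egen risk_category = cut(risk_score), at(0 20 40 1000) icodes
label define risk_lbl 0 "low" 1 "medium" 2 "high"
label values risk_category risk_lbl
So the model becomes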

Code:
stcox i.chemotherapy##i.risk_category
which also runs nicely and produces reasonable results, more or less what we expected to see.

The trouble comes when we then run this analysis on our multiply imputed data. As you might imagine, the probability of recurrence is very low in the low level of risk_category, and lower still when chemotherapy has been done. While our data set is modest in size (N = 920), the number of non-censored observations where the risk category is low and there has been chemotherapy drops to zero in several of the imputations. (In those where it doesn't drop to zero, it ranges between 1 and about 7.) In the imputations with this "empty cell," the regression coefficient becomes, in effect, negative infinity. (Actual numerical values are more like -10^50, but you get the idea.) When these are averaged in with Rubin's rules, the multiply imputed regression coefficient is also, in effect, negative infinity.
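For what it's worth, the offending cell can be checked imputation by imputation with something like this (a sketch; _d is the failure indicator created by -stset-, and I'm assuming low risk is coded 0):
Code:
mi xeq: count if risk_category == 0 & chemotherapy == 1 & _d == 1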

I am sorely tempted to simply exclude from the analysis those imputations that have a zero in that cell. But really, that's unprincipled, and I'm searching for a better way. I thought of perhaps going Bayesian to regularize things with an informative prior. But my audience is not fond of Bayesian statistics. (And I've never tried to use -mi estimate- with the Bayesian commands. Can that even be done?) Is there a penalized maximum-likelihood estimator for these models, and one that runs under -mi estimate-? Any other ideas?
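For completeness: the exclusion I'm tempted by would at least be mechanically easy, since -mi estimate- has an imputations() option. For example, if imputations 3 and 7 out of 20 were the degenerate ones (invented numbers):
Code:
* keep all but the degenerate imputations (3 and 7 here, out of M = 20)
mi estimate, imputations(1/2 4/6 8/20): stcox i.chemotherapy##i.risk_category
But, again, that only implements the temptation; it doesn't justify it.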