Hi all,

Thank you in advance for your comments and suggestions!

We are conducting a case-crossover study to understand how individual risk behaviors and environmental risks lead to acute outcomes (e.g., heart attacks). Because we do not know in advance what an appropriate "control window" would be, we treated the acute event as the final time point (D=1) and divided the time from when a subject got up in the morning until the event into hourly intervals that serve as control periods (D=0). In other words, the last data point for each subject is always "1" (the acute event), and the number of control windows varies across subjects: a subject who got up early in the morning and had the event late at night contributes more data points to the analysis than a subject who got up late and had the event in the afternoon.
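
To make the data layout concrete, here is a minimal Python sketch of the long format described above (our actual analysis is in Stata). The names `Row` and `build_rows` are hypothetical, and the sketch assumes whole-hour wake and event times on the same day, with the event stored as a separate final row after the hourly control windows.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Row:
    study_id: int
    window: int  # 1, 2, ... in time order within a subject
    d: int       # 0 = hourly control window, 1 = acute event (always last)


def build_rows(study_id: int, wake_hour: int, event_hour: int) -> List[Row]:
    """Return one row per hourly control window, followed by the event row.

    Assumption for illustration only: wake_hour and event_hour are whole
    hours on the same day, so the number of control windows is their
    difference.
    """
    n_controls = event_hour - wake_hour
    if n_controls < 1:
        raise ValueError("event must occur at least one hour after waking")
    rows = [Row(study_id, w, 0) for w in range(1, n_controls + 1)]
    rows.append(Row(study_id, n_controls + 1, 1))  # the acute event, D=1
    return rows


# A subject who woke at 06:00 with an event at 23:00 contributes more rows
# than one who woke at 10:00 with an event at 15:00.
early_riser = build_rows(study_id=1, wake_hour=6, event_hour=23)
late_riser = build_rows(study_id=2, wake_hour=10, event_hour=15)
print(len(early_riser), len(late_riser))  # prints "18 6"
```

Each subject forms one matched set (the group in the conditional logistic regression), and the time-varying predictors would be additional columns on each row.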

Our goal is to determine whether the set of risk behaviors (more than one time-varying predictor) or the set of environmental risks (again, more than one time-varying predictor) has better predictive performance. Following the standard analytic steps for case-crossover designs, we fit conditional logistic regressions (-clogit- with group(study_id)) and then estimated the pseudo R-squared.

Now we would also like to estimate (e.g., using -predict- and then -roctab-) the receiver operating characteristic (ROC) curves, the AUC, and the positive and negative predictive values under the different specifications (e.g., Model 1: risk behaviors as the only predictors vs. Model 2: environmental risks as the only predictors).
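
To show what we have in mind, here is a rough Python sketch (not our Stata code) of the two quantities involved: a within-subject conditional probability computed from a model's linear predictor, and a pooled rank-based AUC of the kind an ROC routine would report. The function names and the toy linear-predictor values are made up for illustration, and the sketch assumes exactly one event row per subject.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def conditional_probs(rows: Sequence[Tuple[int, float]]) -> List[float]:
    """rows holds (study_id, xb) pairs, where xb is the linear predictor.

    Returns, for each row, the probability that it is the event row given
    that the subject has exactly one event: exp(xb) / sum of exp(xb) over
    that subject's rows.
    """
    by_id: Dict[int, List[float]] = defaultdict(list)
    for sid, xb in rows:
        by_id[sid].append(xb)
    # Subtract the per-subject maximum before exponentiating for stability.
    top = {sid: max(vals) for sid, vals in by_id.items()}
    denom = {
        sid: sum(math.exp(v - top[sid]) for v in vals)
        for sid, vals in by_id.items()
    }
    return [math.exp(xb - top[sid]) / denom[sid] for sid, xb in rows]


def pooled_auc(d: Sequence[int], score: Sequence[float]) -> float:
    """Share of (event, control) pairs, pooled over all subjects, in which
    the event row has the higher score; ties count as one half."""
    events = [s for y, s in zip(d, score) if y == 1]
    controls = [s for y, s in zip(d, score) if y == 0]
    if not events or not controls:
        raise ValueError("need at least one event row and one control row")
    wins = sum(
        1.0 if e > c else 0.5 if e == c else 0.0
        for e in events
        for c in controls
    )
    return wins / (len(events) * len(controls))


# Toy input: two subjects with made-up linear predictors; the last row of
# each subject is the event.
toy = [(1, -0.4), (1, 0.1), (1, 0.9), (2, 0.2), (2, 0.6)]
d = [0, 0, 1, 0, 1]
p = conditional_probs(toy)
print(pooled_auc(d, p))
```

Note that these probabilities are normalized within each subject, so a subject with more control windows gets smaller values by construction. This is part of why we are unsure whether pooling them across subjects is meaningful.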

I wonder whether estimating the ROC curve, AUC, sensitivity, specificity, and PPV/NPV really makes sense in a case-crossover design (or, more generally, after conditional logistic regression). I have heard different opinions from different people and would really like to know the best ways to evaluate predictive performance in a case-crossover design.

Any suggestions are appreciated!