Dear Statalisters,

My question relates to survival analysis in the presence of time-varying hazards.

From my understanding of the econometrics literature, duration dependence means that the baseline hazard changes over analysis time.

This could be spurious duration dependence caused by unobserved heterogeneity: individuals with high hazard rates fail first and leave the risk set, so the remaining sample increasingly consists of low-hazard individuals.
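
To make the selection effect concrete, here is the standard gamma-frailty result as I understand it. The symbols are my own notation and the setup is an assumption: each individual has hazard Z h_0(t), where the unobserved frailty Z is gamma distributed with mean 1 and variance \theta, and H_0(t) is the cumulative baseline hazard.

\[
S(t) = \bigl(1 + \theta H_0(t)\bigr)^{-1/\theta},
\qquad
h(t) = -\frac{d}{dt}\ln S(t) = \frac{h_0(t)}{1 + \theta H_0(t)} .
\]

Even if h_0(t) is constant, the observed population hazard h(t) declines over time whenever \theta > 0, which would look like negative duration dependence.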

Or this could be due to true duration dependence, where the value of the hazard actually does depend on the amount of time elapsed.

A third consideration is variable duration dependence, where the shape of the hazard function over time differs across levels of a covariate. I suppose this would show up as a time-varying effect (non-proportional hazards) for that covariate.

My question is: do these considerations apply to the Royston-Parmar models fitted with stpm2? If time-varying effects are found when using stpm2, can they be interpreted as true (rather than spurious) time-varying effects? Or would frailty have to be incorporated into the models?
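
The reason I worry about spurious time-varying effects is the following textbook result, sketched under assumptions of my own choosing: a binary covariate x with a constant conditional hazard ratio e^{\beta}, an individual hazard Z h_0(t) e^{\beta x}, and a gamma frailty Z (mean 1, variance \theta) that is independent of x. With H_0(t) the cumulative baseline hazard, the hazard ratio observed in the population is

\[
\mathrm{HR}_{\text{pop}}(t)
= \frac{h(t \mid x = 1)}{h(t \mid x = 0)}
= e^{\beta}\,\frac{1 + \theta H_0(t)}{1 + \theta e^{\beta} H_0(t)} .
\]

This equals e^{\beta} at t = 0 and moves towards 1 as H_0(t) grows, so a model without frailty would show a time-varying hazard ratio even though the individual-level effect is constant.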

I read in a few papers that flexible models (such as stpm2) are robust to unobserved heterogeneity. Is this true?

I created an example below showing time-varying hazard ratios for the covariate x4b. The baseline hazard exhibits duration dependence: it first increases, then decreases, and then levels off. The hazard rates for the two levels of x4b have different shapes, so the hazard ratio comparing them fluctuates around 1.00 over time.
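
For reference, this is how I understand the hazard ratio to arise from a model on the log cumulative hazard scale with a time-varying effect for one covariate. The notation is mine, not taken from the stpm2 documentation: s(\ln t; \gamma) is the baseline spline, s_4(\ln t; \delta) is the spline for the time-varying effect of x4b, and primes denote derivatives with respect to \ln t.

\[
\ln H(t \mid x) = \eta(t, x) = s(\ln t; \gamma) + x^{\top}\beta + x_{4b}\, s_4(\ln t; \delta),
\qquad
h(t \mid x) = \frac{1}{t}\,\frac{\partial \eta}{\partial \ln t}\,\exp\{\eta(t, x)\}
\]

\[
\mathrm{HR}(t)
= \frac{h(t \mid x_{4b} = 1)}{h(t \mid x_{4b} = 0)}
= \exp\{\beta_{4b} + s_4(\ln t; \delta)\}\;
  \frac{s'(\ln t; \gamma) + s_4'(\ln t; \delta)}{s'(\ln t; \gamma)} .
\]

The other covariates cancel because they enter additively with no time-varying terms, so the hazard ratio depends on time only through the two splines.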

Jenny

Code:
webuse brcancer, clear

stset rectime, failure(censrec==1) exit(time 2659)

/* baseline hazard */

stpm2, df(3) scale(hazard)
capture drop bhazard*
predict bhazard, hazard
line bhazard* _t, sort

/* hazard rates for x4b */

stpm2 x4b hormon x1 x2, df(3) dftvc(4) knscale(time) scale(hazard) tvc(x4b)

predict hazard0, hazard at(x4b 0) zeros
predict hazard1, hazard at(x4b 1) zeros
line hazard* _t, sort

/* hazard ratio for x4b (uses the model fitted above, so no refit is needed) */

predict hr, hrnum(x4b 1) hrdenom(x4b 0) ci
line hr* _t if hr_lci>0.2 & hr_uci<6, lpattern(solid dash dash) lcolor(black red red) sort yscale(log) yline(1) yscale(range(0.2(.2)6))