Dear Stata Specialists,

I'm trying to use flexible parametric model to estimate Hazard ratio over time (at 1 year, 5 years and 10 years after start of follow up) and visualize HR change over time using survival data. Because of violation of proportional hazard assumption, I didn't use Cox.

Some brief info about my data:
outcome: development of CVD (0/1)
main exposure: RA (0/1)
covariates: age (continuous), sex (0/1), region (categorical noted by 0-20)
censoring: death during follow-up, emigration during follow-up or end of study (0/1)
follow-up time: in years (from diagnosis of RA to development of CVD or censoring)

*As this is a matched cohort, so I assum the HRs for the 3 covariates are constant over follow-up time.

I used the following coding to draw HR change over time:
stset pyears, failure(outcome)
destring ra, replace
stpm2 ra sex1 age i.region1, scale(h) df(5) eform tvc(ssc) dftvc(3)

range temptime 0 14 101

predict hr, hrnumerator(ra 1) ci timevar(temptime)

twoway (rarea hr_lci hr_uci temptime, color(red%25)) ///
(line hr temptime, sort lcolor(red)) ///
, legend(off) ysize(8) xsize(11) ///
ytitle("Hazard ratio (SSc patients/comparator)") name("hr", replace) ///
xtitle("Years since cohort entry")

So, My first question is that how should I decide the value of df(), dftvc() and the last number following -range-? And how to interpret the exp(b) of variable ra?


The following coding was for generating HRs at 1, 5 and 10 years:
generate t1=1
generate t5=5
generate t10=10

predict hr1, hrnumerator(ra 1) timevar(t1) ci
predict hr5, hrnumerator(ra 1) timevar(t5) ci
predict hr10, hrnumerator(ra 1) timevar(t10) ci

I'm not sure that did these coding make sense in terms of my purpose? Could the results be displayed in form of table?

Thank you in advance!

Best,
Z.