Dear Stata Users,

Normally, predicting survival curves within the study time after fitting a Cox or parametric model is pretty straightforward. However, I'm interested in predicting survival curves or risks beyond the study time. I searched the Stata documentation extensively and did not find anything on this.

Here is my data:

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(time remission group id) double(logwbc sex lwbc3) float trt
 6 1 0  3 2.31 0 2 1
 6 0 0 10  3.2 0 3 1
 6 1 0  2 3.28 0 3 1
 6 1 0  1 4.06 1 3 1
 7 1 0  4 4.43 0 3 1
 9 0 0 11  2.8 0 2 1
10 1 0  5  2.7 0 2 1
10 0 0 12 2.96 0 2 1
11 0 0 13  2.6 0 2 1
13 1 0  6 2.88 0 2 1
16 1 0  7  3.6 1 3 1
17 0 0 14 2.16 0 1 1
19 0 0 15 2.05 0 1 1
20 0 0 16 2.01 1 1 1
22 1 0  8 2.32 1 2 1
23 1 0  9 2.57 1 2 1
25 0 0 17 1.78 1 1 1
32 0 0 18  2.2 1 1 1
32 0 0 19 2.53 1 2 1
34 0 0 20 1.47 1 1 1
35 0 0 21 1.45 1 1 1
 1 1 1 23  2.8 1 2 0
 1 1 1 22    5 1 3 0
 2 1 1 24 4.48 1 3 0
 2 1 1 25 4.91 1 3 0
 3 1 1 26 4.01 1 3 0
 4 1 1 27 2.42 1 2 0
 4 1 1 28 4.36 1 3 0
 5 1 1 29 3.49 1 3 0
 5 1 1 30 3.97 0 3 0
 8 1 1 33 2.32 0 2 0
 8 1 1 31 3.05 0 3 0
 8 1 1 32 3.26 1 3 0
 8 1 1 34 3.52 0 3 0
11 1 1 35 2.12 0 1 0
11 1 1 36 3.49 0 3 0
12 1 1 38  1.5 0 1 0
12 1 1 37 3.06 0 3 0
15 1 1 39  2.3 0 1 0
17 1 1 40 2.95 0 2 0
22 1 1 41 2.73 0 2 0
23 1 1 42 1.97 1 1 0
end
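The data include the following variables:
  • time: survival time in weeks
  • remission: 0=censored; 1=cancer relapse
  • group: 0=placebo; 1=treatment
  • trt: treatment indicator entered in the model; in the listing above it is coded as the reverse of group (trt=1 where group=0)
  • logwbc: log-transformed number of white blood cells
  • sex: 1=male; 0=female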
I used the code below, but I am not sure whether it is correct, and I wonder whether there is another method for making this prediction beyond the study time.

Code:
stset time, failure(remission) // time in weeks
sum _t // the maximum observed follow-up time in the study is 35 weeks

stcox trt logwbc sex, nohr nolog
predict double xbeta, xb // linear predictor (sum of coefficients times covariates) for each individual
predict double basesurv, basesurv // baseline survivor function S0(t), evaluated at each observation's _t

gen newtime=_t+35 // shift each time by 35 weeks, because I want the risk of cancer relapse at week 70
sum newtime // the maximum of newtime is now 70 weeks

sum basesurv if newtime<70 // r(min) = smallest baseline survival among observations with _t<35
gen risk70weeks=1 - r(min)^exp(xbeta) // uses S(t|x) = S0(t)^exp(xb)

sum risk70weeks // risk of cancer relapse at week 70 ranges from 21.76% to 100%
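
One alternative would be a parametric model: the Cox baseline survivor function is only estimated up to the last observed failure time, whereas a fitted parametric distribution is defined at any t. Below is a minimal sketch of that idea, assuming a Weibull proportional-hazards model is reasonable for these data (an assumption that is not checked here). The names xb_wb, wb_p, surv70_wb, and risk70_wb are made up for illustration.

```stata
* Sketch only: assumes the example data above are loaded and that a
* Weibull baseline hazard is plausible (not verified here)
stset time, failure(remission)                 // time in weeks
streg trt logwbc sex, distribution(weibull) nohr nolog

* Weibull PH survivor function: S(t|x) = exp(-exp(xb) * t^p),
* where xb includes the constant and p is the shape parameter
predict double xb_wb, xb                       // linear predictor, including _cons
scalar wb_p = e(aux_p)                         // estimated Weibull shape parameter

gen double surv70_wb = exp(-exp(xb_wb) * 70^scalar(wb_p))   // survival at week 70
gen double risk70_wb = 1 - surv70_wb                        // risk of relapse by week 70
summarize risk70_wb
```

Any value at week 70 obtained this way is an extrapolation, so it depends entirely on the Weibull shape continuing to hold beyond week 35.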
I would be grateful for any help.