I am a student using Stata in my project.
This is my first study using survival analysis and Cox regression.
I have run into some problems and have searched many sources without finding an answer.
Could you please help me?
Thank you.
1. I set up the survival analysis with attained age as the time scale. I ran stcox and then estat phtest, detail. Most of the covariates in the Cox model violate the proportional-hazards (PH) assumption. I then used stpm2 and put all the covariates that violated the PH assumption into tvc(). Is it appropriate to put every violating covariate into tvc(), or is it enough to include only the exposure of interest?
Code:
. * 2.SURVIVAL since ATTAINED AGE at diagnosis*
.
. stset exit_date, fail(Death==1) id(id) enter(indexdate) origin(birthdatescb) scale(365.24)
id: id
failure event: Death == 1
obs. time interval: (exit_date[_n-1], exit_date]
enter on or after: time indexdate
exit on or before: failure
t for analysis: (time-origin)/365.24
origin: time birthdatescb
------------------------------------------------------------------------------
265,173 total observations
34,043 observations end on or before enter()
------------------------------------------------------------------------------
231,130 observations remaining, representing
231,130 subjects
157,782 failures in single-failure-per-subject data
733,687.56 total analysis time at risk and under observation
at risk from t = 0
earliest observed entry t = 45.0115
last observed exit t = 108.4438
Code:
. stcox Sve i.education_merge i.income_cat sex i.kommun_types live_alone cci
failure _d: Death == 1
analysis time _t: (exit_date-origin)/365.24
origin: time birthdatescb
enter on or after: time indexdate
id: id
Iteration 0: log likelihood = -1529196.3
Iteration 1: log likelihood = -1517040.7
Iteration 2: log likelihood = -1516187.9
Iteration 3: log likelihood = -1516181.5
Iteration 4: log likelihood = -1516181.5
Refining estimates:
Iteration 0: log likelihood = -1516181.5
Cox regression -- Breslow method for ties
No. of subjects = 224,116 Number of obs = 224,116
No. of failures = 152,876
Time at risk = 711694.7076
LR chi2(10) = 26029.69
Log likelihood = -1516181.5 Prob > chi2 = 0.0000
-----------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
------------------+----------------------------------------------------------------
Sve | .5996859 .0035806 -85.64 0.000 .592709 .6067449
|
education_merge |
Upper secondary | .9361713 .0053124 -11.62 0.000 .9258169 .9466414
University | .8898578 .0097304 -10.67 0.000 .8709894 .9091349
|
income_cat |
Middle tertile | .9723618 .0060747 -4.49 0.000 .9605283 .9843412
Highest tertile | .9077582 .006752 -13.01 0.000 .8946205 .9210887
|
sex | .7143673 .0042241 -56.88 0.000 .706136 .7226946
|
kommun_types |
Intermediate | 1.00545 .0064891 0.84 0.400 .9928117 1.018249
Rural | .9984355 .0066947 -0.23 0.815 .9854001 1.011643
|
live_alone | 1.094858 .006443 15.40 0.000 1.082302 1.107559
cci | 1.148494 .0013698 116.09 0.000 1.145812 1.151181
-----------------------------------------------------------------------------------
.
. estat phtest, detail
Test of proportional-hazards assumption
Time: Time
----------------------------------------------------------------
| rho chi2 df Prob>chi2
------------+---------------------------------------------------
Sve | 0.00967 14.20 1 0.0002
1b.educati~e| . . 1 .
2.educatio~e| 0.00206 0.65 1 0.4210
3.educatio~e| 0.00280 1.19 1 0.2747
1b.income_~t| . . 1 .
2.income_cat| 0.00677 7.00 1 0.0082
3.income_cat| 0.02074 65.91 1 0.0000
sex | 0.00740 8.44 1 0.0037
1b.kommun_~s| . . 1 .
2.kommun_t~s| 0.01065 17.29 1 0.0000
3.kommun_t~s| 0.01770 47.72 1 0.0000
live_alone | -0.02901 130.41 1 0.0000
cci | -0.04345 263.37 1 0.0000
------------+---------------------------------------------------
global test | 542.40 10 0.0000
----------------------------------------------------------------
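With about 153,000 failures, the test above can flag very small departures from PH, and every rho it reports is below 0.05 in absolute value. A minimal sketch of a graphical check that could help judge whether the departures matter in practice, assuming the data are still stset as above and the same model is refitted. The smoothing in these plots may be slow on data of this size.

```stata
* Assumption: data in memory are stset on attained age exactly as in the post.
* Refit the Cox model quietly so that estat phtest has results to work from.
quietly stcox Sve i.education_merge i.income_cat sex i.kommun_types live_alone cci

* Smoothed scaled Schoenfeld residuals against analysis time for the two
* covariates with the largest chi2 values; a roughly flat curve is consistent with PH.
estat phtest, plot(cci)
estat phtest, plot(live_alone)
```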
2. When I add all of the violating covariates to tvc() in stpm2, Stata keeps showing the message "note: delayed entry models are being fitted. Iteration 0: log likelihood = 321425.8 (not concave) ...". It takes hours for Stata to return a result. Could you please explain what is happening?
Code:
. stpm2 Sve education_merge2 education_merge3 income_cat2 income_cat3 kommun_types2 kommun_types3 sex live_alone cci, scale(hazard) df(4)
> tvc(S income_cat2 income_cat3 kommun_types2 kommun_types3 sex live_alone cci) dftvc(3) eform
note: delayed entry models are being fitted
Iteration 0: log likelihood = 328328.54 (not concave)
Iteration 1: log likelihood = 328413.17 (not concave)
Iteration 2: log likelihood = 328576.82 (not concave)
Iteration 3: log likelihood = 328665.52 (not concave)
Iteration 4: log likelihood = 328688.49 (not concave)
Iteration 5: log likelihood = 328710.02 (not concave)
Iteration 6: log likelihood = 328720.01 (not concave)
Iteration 7: log likelihood = 328729.98 (not concave)
Iteration 8: log likelihood = 328735.29 (not concave)
Iteration 9: log likelihood = 328760.66 (not concave)
Iteration 10: log likelihood = 328771.01 (not concave)
Iteration 11: log likelihood = 328780.6 (not concave)
Iteration 12: log likelihood = 328789.31 (not concave)
Iteration 13: log likelihood = 328807.42 (not concave)
Iteration 14: log likelihood = 328818.96 (not concave)
Iteration 15: log likelihood = 328825.58 (not concave)
Iteration 16: log likelihood = 328832.69 (not concave)
Iteration 17: log likelihood = 328842.97 (not concave)
Iteration 18: log likelihood = 328847.31 (not concave)
Iteration 19: log likelihood = 328850.81 (not concave)
Iteration 20: log likelihood = 328852.44 (not concave)
Iteration 21: log likelihood = 328854.3 (not concave)
Iteration 22: log likelihood = 328855.65 (not concave)
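The call above asks for 8 time-varying covariates with 3 spline df each, that is 24 extra spline interaction terms on top of delayed entry, which may be part of why the optimizer struggles. A sketch of a smaller, stepwise alternative, assuming the same stset data and dummy variables; the stored-estimate names ph and tvc_cci are placeholders chosen for illustration.

```stata
* Assumption: same stset data and dummy variables as in the stpm2 call above.
* Step 1: proportional-hazards spline model with no tvc() terms.
stpm2 Sve education_merge2 education_merge3 income_cat2 income_cat3 ///
    kommun_types2 kommun_types3 sex live_alone cci, scale(hazard) df(4) eform
estimates store ph

* Step 2: let a single covariate vary with time, using fewer spline df.
stpm2 Sve education_merge2 education_merge3 income_cat2 income_cat3 ///
    kommun_types2 kommun_types3 sex live_alone cci, scale(hazard) df(4) ///
    tvc(cci) dftvc(2) eform
estimates store tvc_cci

* Step 3: likelihood-ratio test of the time-varying term.
lrtest ph tvc_cci
```

Further covariates could then be added to tvc() one at a time, keeping only the terms that change the estimates of interest.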
3. If I use the tvc() option of stcox instead of stpm2, the command also takes many hours to finish. I am using Stata 15 MP2 on a very powerful computer. Do you see similar waiting times when running stcox with the tvc() option?
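One possibly cheaper alternative to tvc() is a step-function effect: split follow-up into age bands and estimate a separate hazard ratio per band. The sketch below assumes the data are stset with id(id) as above and that live_alone is coded 0/1; ageband is a hypothetical variable name and the 10-year cut points are arbitrary.

```stata
* Assumption: data are stset on attained age with id(id), as in the post.
* Split each subject's follow-up at ages 55, 65, ..., 105 (this adds records).
* ageband holds the lower bound of each band, and 0 for follow-up before age 55.
stsplit ageband, at(55(10)105)

* Band-specific hazard ratios for cci and live_alone (treated as a 0/1 variable);
* the remaining covariates keep a single proportional effect.
stcox Sve i.education_merge i.income_cat sex i.kommun_types ///
    i.ageband#c.cci i.ageband#c.live_alone
```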
4. Since I set up the survival analysis with attained age as the time scale, I don't need to adjust for age in the Cox regression, do I? What if I use time since the diagnosis date as the time scale instead?
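For comparison, a sketch of the time-since-diagnosis setup, assuming indexdate is the diagnosis date and birthdatescb the birth date as in the stset call above; age_dx is a hypothetical variable name. On this time scale the baseline hazard no longer absorbs age, so age at diagnosis would normally enter the model as a covariate.

```stata
* Assumption: indexdate is the diagnosis date and birthdatescb the birth date.
* Age at diagnosis in years (age_dx is a name made up for this sketch).
generate age_dx = (indexdate - birthdatescb) / 365.24

* Time since diagnosis as the time scale: the origin moves to indexdate,
* so there is no delayed entry.
stset exit_date, failure(Death==1) id(id) origin(time indexdate) scale(365.24)

* Age is now adjusted for explicitly.
stcox Sve i.education_merge i.income_cat sex i.kommun_types live_alone cci age_dx
```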
5. I plotted a Kaplan-Meier curve on the attained-age time scale. I tried to restrict the x-axis to the range 45 to 115, but the x-axis in the graph still runs from 0 to 115. Could you please explain why? How can I make the axis run from 45 to 115?
Code:
sts graph, by(patient) ///
    title("Survival since birth-attained age between Sve and non-Sve patients", size(3.5) style(heading)) ///
    ylabel(0.2(0.2)1, labsize(small)) xscale(range(45 115)) xlabel(45(10)115, labsize(small)) ///
    xtitle("Attained age", size(3)) ytitle("Survival rate", size(3)) ///
    legend(order(1 "Sve" 2 "non-Sve") ring(0) position(7) rows(2) size(small)) ///
    caption("Log rank test p<0.001", size(small)) name(d, replace)
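A likely explanation is that xscale(range()) can only widen an axis, never narrow it, and that sts graph by default starts each curve at t = 0. A sketch of one way around this, assuming the same stset data and the by(patient) grouping used above, is the noorigin option of sts graph, which starts the curves at the first exit time.

```stata
* Assumption: data are stset on attained age as in the post.
* noorigin starts the curves at the first exit time rather than at t = 0,
* so the axis no longer has to reach back to 0.
sts graph, by(patient) noorigin ///
    xlabel(45(10)115, labsize(small)) ylabel(0.2(0.2)1, labsize(small)) ///
    xtitle("Attained age", size(3)) ytitle("Survival rate", size(3)) ///
    legend(order(1 "Sve" 2 "non-Sve") ring(0) position(7) rows(2) size(small))
```

The tmin(45) option of sts graph, which restricts the plot to t >= 45, may be worth trying as an alternative.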
I'm sorry for asking so many questions, but I am still a beginner.
Thank you.