Hi all,
I am trying to set up survival data for a cohort of breast cancer patients, following them from their breast cancer diagnosis until date of death or the end of 2017, whichever is earlier (i.e. I want patients who are alive on 31 Dec 2017 to be censored at that date).
I generated my follow-up time as follows:
gen timetodeath=(min(dateofdeath,td(31dec2017))-dateofdiag)/365.25
My outcome of interest is breast cancer death, which is a dichotomous variable in my dataset indicating whether a woman died of breast cancer (0=didn’t die of breast cancer, 1=did die of breast cancer).
I then stset my data as follows:
stset timetodeath, failure(bcdeath==1) id(patientid)
When I check my ‘timetodeath’ variable, patients have been followed up for the correct length of time (i.e. from their date of diagnosis until death or the end of 2017, whichever is earlier).
However, when I check the ‘_d’ variable that Stata produces when stsetting data, some patients who died of breast cancer AFTER 31 Dec 2017 are being counted as failures instead of being censored. This is strange to me: as I mentioned, their follow-up time is correct, but it seems they are being followed until 31 Dec 2017 and then still counted as a failure/event at that date.
What is going on here? How can I fix my data setup so that women who haven’t died by 31 Dec 2017 are censored at that date and aren’t counted as deaths?
Any help would be much appreciated.
Thanks!
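A possible explanation, offered as a sketch rather than a definitive answer: the follow-up time is truncated at 31 Dec 2017, but bcdeath is still 1 for anyone who ever died of breast cancer, including deaths after that date, so stset records a failure at the truncated time. One way to handle this would be to build an event indicator that is truncated at the same date. The variable name bcdeath2017 is hypothetical, and the code assumes dateofdeath is a Stata daily date that is missing for patients who are still alive.

```stata
* Event indicator censored at the end of follow-up (bcdeath2017 is a hypothetical name).
* A missing dateofdeath compares as larger than any date in Stata,
* so patients with no recorded death get 0 here.
gen byte bcdeath2017 = (bcdeath == 1 & dateofdeath <= td(31dec2017))

* Same follow-up time as before, but failures now only count up to 31 Dec 2017.
stset timetodeath, failure(bcdeath2017==1) id(patientid)

* Check: there should be no failures among women who died after 31 Dec 2017.
tab _d if dateofdeath > td(31dec2017) & !missing(dateofdeath)
```

With this setup, a woman who died of breast cancer in 2018 contributes follow-up time until 31 Dec 2017 and has _d equal to 0.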