I have a very large population database of patients who have been treated for localized prostate cancer with various therapies (surgery, radiation, systemic drugs, etc.). It contains oncologic outcomes (e.g., time to recurrence, metastasis, death, death from prostate cancer, next therapy, or last follow-up if none of the above). The database holds about 20,000 individuals tracked longitudinally and is mostly used for comparative effectiveness research centered on survival analyses.
I would like to perform analyses whereby I select a subset of patients from this database having a similar distribution of baseline characteristics to patients from published, prospectively performed clinical trials.
For example, the published RTOG 9601 trial was a two-arm randomized trial that investigated the addition (or omission) of a drug therapy for patients receiving radiation therapy after surgery for localized prostate cancer.
Here is a table from the publication showing the baseline characteristics of the population:
[Table image not reproduced: baseline characteristics of the RTOG 9601 trial population]
How would I go about selecting a population from my database of 20,000 patients that would match the distribution of these published patients?
The end goal is to see how different therapies applied to a similar, matched patient population compare to published studies. Without the individual patient-level data from the published studies, I am not sure how to proceed.
Here is my data structure that matches the categories above:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(race Age Karnofsky personid Gleason_Score) byte pick3
5 65.41547 3 50 1 1
7 66.90486 9 155 1 1
7 78.17933 3 200 1 1
7 60.4846 3 278 0 1
5 62.89117 9 339 1 1
7 . 9 500 0 1
7 69.65366 9 527 1 1
7 66.49418 9 901 2 1
7 73.58248 9 1103 1 1
7 66.428474 9 2193 2 1
7 . 9 5267 0 1
7 63.96714 3 5343 0 1
6 . 9 5388 0 1
4 . 9 5465 0 1
7 77.16632 3 7921 0 1
6 77.37167 9 8124 0 1
6 72.1013 9 8556 2 1
7 . 9 8992 0 1
7 69.89733 9 8994 2 1
7 81.27584 8 9017 2 1
7 65.98768 9 9126 0 1
6 70.57358 3 9855 2 1
7 58.82272 9 10155 1 1
5 70.8063 9 10244 2 1
7 71.66872 8 10395 1 1
7 60.94456 9 10734 0 1
7 65.420944 9 10863 1 1
7 69.9165 9 11436 1 1
7 66.49145 9 11453 1 1
7 70.87474 3 11537 0 1
7 64.19439 3 11622 1 1
7 73.52772 9 11716 2 1
7 51.40862 3 11844 0 1
7 63.01985 9 12173 1 1
7 56.53388 9 12293 1 1
7 74.06434 9 12427 2 1
7 65.94661 9 12914 1 1
7 47.20329 9 13012 2 1
7 68.98015 9 13277 2 1
7 59.82204 9 13591 2 1
end
label values race race
label def race 4 "Native Am.", modify
label def race 5 "Other", modify
label def race 6 "Unknown", modify
label def race 7 "White", modify
label values Karnofsky RT_KPS
label def RT_KPS 3 "(Karnofsky) 100 - Normal, no co", modify
label def RT_KPS 8 "(Karnofsky) 80 - Normal activit", modify
label def RT_KPS 9 "(Karnofsky) 90 - Able to carry", modify
label values Gleason_Score GS
label def GS 0 "2-6", modify
label def GS 1 "7", modify
label def GS 2 "8-10", modify
Any help appreciated.
Cheers,
JT