Hi there,

I have 2 datasets; one shows the data of patients who were diagnosed with cancers and the other one shows the details of medications they use. Both datasets have multiple rows for one patient (ex: multiple cancers for 1 patient in the cancer data and multiple medicines for 1 patient in the medication data). I want to merge these 2 datasets to find out how is the medication use of patients with cancers. As this will be a m:m merging I’m not sure whether the usual stata code will appropriately work out.

Below is a part of 2 datasets that I want to get merged. Highly appreciate if you can help me to find a suitable command to merge them properly.

Thank you in advance!
Thushani

Medication dataset
input str11 ID str1 gender long medication_code float(frequency daily_dose)
"1" "M" 70238 1 1
"1" "M" 70238 1 1
"1" "M" 70238 1 1
"2" "F" 70238 1 1
"2" "F" 67117 1 1
"2" "F" 67117 1 1
"3" "M" 67117 1 1
"3" "M" 70238 1 1
"4" "F" 74121 1 1
"4" "F" 67265 1 1
"4" "F" 67265 1 1
"5" "M" 70238 2 2
"5" "M" 70238 2 2
"5" "M" 70238 2 2
"5" "M" 70238 2 2

Cancer dataset
input str11 ID str1 gender int age str4 site int morph
"1" "M" 79 "C160" 8070
"2" "F" 74 "C20" 8140
"2" "F" 74 "C187" 8140
"2" "F" 74 "C250" 8140
"3" "M" 75 "C187" 8140
"3" "M" 75 "C250" 8000
"4" "F" 85 "C259" 8000
"4" "F" 85 "C187" 8140
"4" "F" 85 "C250" 8000
"5" "M" 78 "C187" 8140
"5" "M" 78 "C187" 8140
"5" "M" 78 "C221" 8160
"5" "M" 78 "C20" 8140
"5" "M" 78 "C20" 8140