I am working with survival data, using individual time at risk until outcome, death, or end of follow-up. The purpose is to calculate incidence rates and run Cox regression analyses.

As you can see in the picture I have attached, my problem is that I have a dataset in which every patient admission, patient contact, and hospital transfer is registered, even though the patient may have had only one admission to the hospital (sorry for the photo quality, but I am working on protected software).

Under "personnummer" you can see that the patient whose number ends in "2863" contributes 12 registrations, two of which include the diagnosis code DS826. In this situation I only need observation 114, because it has the earliest admission date, and I want to delete the other eleven observations.
Then there are the patients without the outcome, for example the patient under "personnummer" whose number ends in "4283". Here I need to delete all observations except one. My first question is: which one would be the right one to keep, in terms of admission date, and how do I do this in the most practical way? Keep in mind that separate admissions and department registrations are mixed together.

- My next question concerns Stata commands: which commands do I need to delete the duplicates I do not want and keep the ones I do want?
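
To make the question concrete, here is a minimal sketch of the kind of approach I imagine, not a confirmed solution. Only `personnummer` and the code `DS826` come from my data; the variable names `indate` (admission date, stored as a Stata date) and `diag` (diagnosis code, stored as a string) are hypothetical. The sketch also assumes that the earliest registration is the right one to keep for patients without the outcome, which is exactly what I am unsure about.

```stata
* Hypothetical variable names: indate (admission date), diag (diagnosis code).
* personnummer is the patient identifier shown in the picture.

* Flag registrations that carry the outcome diagnosis
generate byte outcome = (diag == "DS826")

* Flag patients who have the outcome on at least one registration
bysort personnummer: egen byte ever_outcome = max(outcome)

* Rows to deprioritise: non-outcome rows of patients who do have the outcome
generate byte skip = (ever_outcome == 1 & outcome == 0)

* Within each patient, sort preferred rows first, then by admission date,
* and keep only the first row:
*   - patients with the outcome: earliest registration with DS826
*   - patients without the outcome: earliest registration (assumption)
bysort personnummer (skip indate): keep if _n == 1

drop skip
```

If two registrations for the same patient share the same admission date, the row that is kept is arbitrary unless a further tie-breaking variable is added inside the parentheses.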

I hope this is understandable. I would be happy to clarify or chat with anyone who believes they can help. Thank you.