I am working with Stata/MP 15.1 and have appended two datasets.

Database 1 includes a list of patients who had a specific operation. Each patient appears only once in this dataset.

Database 2 includes the full medical record of every patient in database 1, so each patient is likely to appear many times.

I think I have been able to identify the date of the first readmission after discharge from hospital using:

Code:
bysort patient: egen firstreadmit = min(readmission)
format firstreadmit %td
However, what I would really like to do is:
  1. Return the cause of the first readmission, i.e. the "readmissioncausecode" variable.
  2. Identify the date and cause of the second readmission, which I can't easily do using min(readmission).
  3. Only identify patients being readmitted for a specific reason, e.g. readmissioncausecode==400.
I realise that this is three questions in one post, but it seemed easier to present them together with the context of what I am trying to achieve. If anyone can provide hints or point me towards how I could achieve (1), (2), or (3), I would be very grateful.

I have included a dummy dataset below, which broadly reflects the structure of the databases I am using.


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(database patient) float(opdate dischargedate readmission) int readmissioncausecode
1 1 19002 19007     .   .
1 2 19400 19408     .   .
1 3 19389 19422     .   .
2 1     .     . 20739 400
2 1     .     . 21138 444
2 1     .     . 21268 999
2 2     .     . 21165 400
2 2     .     . 19919 233
2 2     .     . 20630 889
2 3     .     . 21268 889
2 3     .     . 21165 889
2 3     .     . 19919 123
2 3     .     . 20630 223
end
format %td opdate dischargedate readmission
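
To make the three goals concrete, here is a minimal sketch of one way they might be approached on the dummy data above. It is not a tested solution. It assumes that every non-missing readmission date falls after the patient's discharge date (the sketch does not check this) and that no patient has two readmissions on the same day (ties would be ordered arbitrarily). The variable names readmit_seq, readmit1date, readmit1cause, readmit2date, readmit2cause, ever400 and first400date are made up for illustration.

```stata
* Sketch only: run after the example data above have been loaded.
* All new variable names below are illustrative, not from the real databases.

* Number each patient's readmissions in date order.
* Missing dates sort last, so the database 1 rows get no sequence number.
bysort patient (readmission): gen readmit_seq = _n if !missing(readmission)

* (1) and (2): copy the date and cause of the 1st and 2nd readmission
* onto every row of the same patient.
forvalues k = 1/2 {
    bysort patient: egen readmit`k'date  = max(cond(readmit_seq == `k', readmission, .))
    bysort patient: egen readmit`k'cause = max(cond(readmit_seq == `k', readmissioncausecode, .))
    format readmit`k'date %td
}

* (3): flag patients ever readmitted with cause code 400,
* and record the date of their first such readmission.
bysort patient: egen byte ever400 = max(readmissioncausecode == 400)
bysort patient: egen first400date = min(cond(readmissioncausecode == 400, readmission, .))
format first400date %td

* One row per patient, using the database 1 records.
list patient opdate dischargedate readmit1date readmit1cause ///
    readmit2date readmit2cause ever400 first400date if database == 1, noobs
```

The idea is that sorting within patient by readmission date turns "first" and "second" into sequence numbers 1 and 2, so the same pattern extends to later readmissions by changing the range in forvalues. On the dummy data I would expect patient 2, for example, to show a first readmission with cause code 233 and a second with cause code 889, and patient 3 to have ever400 equal to 0.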