sorting data

Hi Stata Forum,

I have a long dataset of hospitalization data. I am looking at number of lower limb amputations, by year.
For each year, I want to select only the highest level of amputation per person.

For example, if one individual (user ID=3) has four amputations in total between 2000 and 2015 (one in 2000, one in 2003, and then two in 2005) I want to include the amputation in 2000, the one in 2003, and then only the highest amputation in 2005. So, if in 2005, they had a toe AND a foot amputation, i only want to include the foot amputation.

I can sort the data (code below), so that i can take the FIRST amputation for each year (taking simply dup==1). But i cannot work out how to take the HIGHEST amputation for each year.

sort id date _year
gen dup=1
replace dup = dup[_n-1]+1 if id==id[_n-1] & _year==_year[_n-1]

*each amputation is coded as no(=0) or year(=1)

I apologize for the table below (but i am having difficulty installing the ado file (directory issues) to do it any other way).

id	date	year	toe	foot	ankle	leg	dup
3	03feb2000	2000	1	0	0	0	1
3	06jun2003	2003	1	0	0	0	1
3	02jan2005	2005	1	0	0	0	1
3	21feb2005	2005	0	1	0	0	2
4	30nov2008	2008	0	1	0	0	1
5	12aug2000	2000	1	0	0	0	1
5	11sept2010	2010	1	0	0	0	1
5	19sept2010	2010	0	1	0	0	2
5	8oct2010	2010	0	0	0	1	3

Any suggestions?

Many thanks
Jess

BJ Data Tech Solution

Home / Data Cleaning / Data management / Data Processing / sorting data
sorting data

0 Response to sorting data

Post a Comment

Home / Data Cleaning / Data management / Data Processing / sorting data sorting data

Related Posts with sorting data

0 Response to sorting data

Post a Comment

Home / Data Cleaning / Data management / Data Processing / sorting data
sorting data