I have a dataset which looks something like below:
ID | Amount | month | max_amount | max_month | N |
1 | 10 | 1 | 20 | . | 2 |
1 | 20 | 2 | 20 | 2 | 2 |
2 | 10 | 1 | 10 | 1 | 3 |
2 | 10 | 2 | 10 | 2 | 3 |
2 | 5 | 3 | 10 | . | 3 |
3 | 20 | 2 | 20 | 2 | 1 |
4 | 30 | 4 | 30 | 4 | 1 |
egen max_amount =max(Amount), by(ID)
bysort ID: gen max_month = month if Amount>= max_amount
bysort ID: gen N=_N
My problem is with the variable max_month. For my analysis I need one unique value for each ID. Notice that for the ID 2, there are 3 observations. Two of these observations have same value for "amount", i.e. 10. So my maximum amount is 10 but the months corresponding to these 2 "maximums" are different. Now for practical purposes, I am willing to consider only the earliest month among the duplicate "maximums", i.e. I want to retain only the earliest month with a maximum amount for each ID . I tried several ways to do this. Following is a failed attempt at a loop :
foreach num i=1/N {
replace max_month[`i'] = max_month[`i'-1] if max_month[`i']>max_month[`i'-1]
}
I am also wondering if I am not searching for the correct "keywords" on Google to look for an answer to this. Any help would be appreciated.
Thank you!
0 Response to Using a loop to compare observations of a variable
Post a Comment