Dear Statalist,
Here is an example of my data; two of its variables are CaseID and Surveyyear:
| CaseID | Surveyyear |
| 1 | 2004 |
| 1 | 2004 |
| 1 | 2004 |
| 2 | 1999 |
| 2 | 1999 |
| 2 | 1999 |
| 2 | 2000 |
| 2 | 2000 |
| 2 | 2000 |
| 3 | 2002 |
| 3 | 2002 |
| 3 | 2009 |
| 3 | 2010 |
| 3 | 2010 |
For each CaseID group that has more than one value of Surveyyear, I want to keep only the observations that have the maximum value of Surveyyear. If a group has only one value of Surveyyear, I want to keep all of the observations in that group. That is, I want to produce the following result:
| CaseID | Surveyyear |
| 1 | 2004 |
| 1 | 2004 |
| 1 | 2004 |
| 2 | 2000 |
| 2 | 2000 |
| 2 | 2000 |
| 3 | 2010 |
| 3 | 2010 |
Could someone help me? I tried using 'collapse' and 'keep if _n == _N', but both keep only one observation per group rather than all of the observations that share the maximum value.
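For reference, here is a minimal sketch of one approach that might do this. It assumes Surveyyear is stored as a numeric variable and that the names match the example above; maxyear is a hypothetical helper variable introduced only for illustration. The idea is to attach each group's maximum year to every row of that group and then filter on it, rather than selecting the last row with _n == _N.

```stata
* Assumption: CaseID and Surveyyear are numeric, named as in the example above.
* maxyear is a hypothetical helper variable, not part of the original data.

* Store the largest Surveyyear of each CaseID group on every row of that group
bysort CaseID: egen maxyear = max(Surveyyear)

* Keep every observation whose year equals its group's maximum;
* a group with a single Surveyyear value keeps all of its rows
keep if Surveyyear == maxyear

* Remove the helper variable
drop maxyear
```

Because the comparison is made row by row against the group maximum, ties are all retained, which is what 'keep if _n == _N' cannot do since it selects only the last observation in each group.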

Thank you so much in advance.
Best regards,
Cameron.