Dear Statalist,
Here is an example of my data. Two of its variables are CaseID and Surveyyear:
CaseID | Surveyyear
1 | 2004
1 | 2004
1 | 2004
2 | 1999
2 | 1999
2 | 1999
2 | 2000
2 | 2000
2 | 2000
3 | 2002
3 | 2002
3 | 2009
3 | 2010
3 | 2010
For each CaseID group that has more than one value of Surveyyear, I want to keep only the observations that have the maximum value of Surveyyear. If there is only one value of Surveyyear in a group, I want to keep all of the observations in that group. That is, I want to produce the following result:
CaseID | Surveyyear
1 | 2004
1 | 2004
1 | 2004
2 | 2000
2 | 2000
2 | 2000
3 | 2010
3 | 2010
Could someone help me? I tried using 'collapse' and 'keep if _n == _N', but both keep only one observation in each group, not all of the observations that share the maximum value.

Thank you so much in advance.
Best regards,
Cameron.
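
One possible approach, offered as a minimal sketch rather than a definitive answer: compute the group maximum with egen, then keep every observation that equals it. The variable name maxyear is a hypothetical helper introduced only for this illustration, and the input block simply recreates the example data from the post. It assumes Surveyyear is numeric.

```stata
* Recreate the example data shown in the post
clear
input CaseID Surveyyear
1 2004
1 2004
1 2004
2 1999
2 1999
2 1999
2 2000
2 2000
2 2000
3 2002
3 2002
3 2009
3 2010
3 2010
end

* maxyear is a hypothetical helper variable (not in the original data):
* it holds the largest Surveyyear within each CaseID group
bysort CaseID: egen maxyear = max(Surveyyear)

* Keep every observation whose Surveyyear equals its group maximum,
* so ties are all retained instead of a single row per group
keep if Surveyyear == maxyear
drop maxyear

list, sepby(CaseID)
```

Groups with a single Surveyyear value are kept whole because every row in them already equals the group maximum. If Surveyyear has no missing values, a one-line alternative would be bysort CaseID (Surveyyear): keep if Surveyyear == Surveyyear[_N], since under by: the subscript _N refers to the last observation of each sorted group.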