Thanks to your previous help, I was able to create an indicator variable for the number of moves per individual in my dataset (the *have column below). What I'm trying to do now is drop (or set to missing) everything but the highest number of moves for each individual.
Essentially, I want to determine the total number of individuals who moved more than once within my single year of data. When I tabulate the *have column, however, it overcounts, since, for example, ind_id 2 is counted three times (once each for moves 1, 2, and 3). I would like to keep only the highest number of moves, as in the *want column, and remove all other values so that tabulating *want gives an accurate figure that counts each individual only once.
Perhaps I'm overlooking some simple syntax, but I'm at a loss. Thank you!
ind_id | seq_id | city | *have | *want |
1 | 1 | "Hope" | - | - |
1 | 2 | "Hope" | - | - |
1 | 3 | "Aurora" | 1 | 1 |
1 | 4 | "Aurora" | - | - |
1 | 5 | "Aurora" | - | - |
2 | 1 | "Hope" | - | - |
2 | 2 | "Aurora" | 1 | - |
2 | 3 | "Aurora" | - | - |
2 | 4 | "Jackson" | 2 | - |
2 | 5 | "Jackson" | - | - |
2 | 6 | "Hope" | 3 | 3 |
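To make the intended logic concrete, here is a minimal sketch in plain Python (chosen only for illustration; it is not necessarily the software used for the data above). The rows are copied from the example table, with None standing in for "-". The function name keep_highest and the variable names are hypothetical. The sketch assumes *have is a running count, so each individual's maximum appears on exactly one row.

```python
from collections import Counter

# Rows copied from the example table: (ind_id, seq_id, city, have).
# None stands in for the "-" (missing) entries.
rows = [
    (1, 1, "Hope", None),
    (1, 2, "Hope", None),
    (1, 3, "Aurora", 1),
    (1, 4, "Aurora", None),
    (1, 5, "Aurora", None),
    (2, 1, "Hope", None),
    (2, 2, "Aurora", 1),
    (2, 3, "Aurora", None),
    (2, 4, "Jackson", 2),
    (2, 5, "Jackson", None),
    (2, 6, "Hope", 3),
]


def keep_highest(rows):
    """Build the *want column: keep only each individual's highest *have value."""
    # First pass: highest move count observed for each individual.
    max_moves = {}
    for ind_id, _seq_id, _city, have in rows:
        if have is not None:
            max_moves[ind_id] = max(have, max_moves.get(ind_id, 0))

    # Second pass: keep a value only where it equals that individual's maximum.
    return [
        have if have is not None and have == max_moves[ind_id] else None
        for ind_id, _seq_id, _city, have in rows
    ]


want = keep_highest(rows)

# Tabulate *want: each individual with at least one move is counted once.
tab = Counter(value for value in want if value is not None)
print(sorted(tab.items()))

# Number of individuals who moved more than once.
moved_more_than_once = sum(n for moves, n in tab.items() if moves > 1)
print(moved_more_than_once)
```

Individuals who never moved have no non-missing *have value, so they do not appear in the tabulation at all; they would need to be counted separately if a zero-moves category is wanted.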