Within each id, I want to drop every observation that comes after the first occurrence of an event, regardless of whether the event occurs again later.
The following table is an example of the data I'm working with:
id | year | event |
1 | 2007 | 0 |
1 | 2008 | 0 |
1 | 2009 | 1 |
1 | 2010 | 1 |
1 | 2011 | 1 |
1 | 2012 | 1 |
2 | 2007 | 0 |
2 | 2008 | 0 |
2 | 2009 | 1 |
2 | 2010 | 0 |
2 | 2011 | 0 |
2 | 2012 | 1 |
2 | 2013 | 0 |
3 | 2007 | 0 |
3 | 2008 | 1 |
3 | 2009 | 0 |
3 | 2010 | 0 |
3 | 2011 | 0 |
3 | 2012 | 0 |
3 | 2013 | 1 |
What I tried was to compute a running sum of the events within each id and drop the observations where that sum exceeds 1:
by id (year), sort: gen byte sum = sum(event)
drop if sum>1
This works for id 1, but not for ids 2 and 3. After the first event the running sum stays at 1 until the event occurs again, so the observations between the first and second events (for example, id 2 in 2010 and 2011) are kept even though they should be dropped.
I can't seem to find a way to solve this. Any help would be much appreciated!
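One possible approach, sketched here under the assumption that event is coded 0/1 and is never missing: count the events that occurred strictly before each observation, rather than up to and including it, and drop wherever that count is at least 1. The variable name prior_events is hypothetical and is only used for illustration.

```stata
* Sketch only: assumes event is 0/1 with no missing values.
* prior_events is a hypothetical helper variable.

* Running sum of event within id (ordered by year), minus the current
* value, gives the number of events strictly before this observation.
bysort id (year): gen prior_events = sum(event) - event

* Anything that comes after the first event has at least one prior event.
drop if prior_events >= 1

* Remove the helper once it is no longer needed.
drop prior_events
```

On the example data this should keep 2007 to 2009 for ids 1 and 2, and 2007 to 2008 for id 3. The year of the first event itself is retained because its count of prior events is still 0.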