I have two sets of data and I append them together. The first set consists of a series of events and the year the event happens. There is only 1 event per year for each firm. A firm could have multiple events in different years. I call this data the "post" data, and I have a dummy post to indicate whether the data is from the "post" data. The other set of data consists of the year before the event year. For each firm in the post data, there is a corresponding pre year. I call this data the "pre“ data. In the "pre“ data, post == 0.
I have the following variables: symbol, year to identify which firm it is and of which year, and a post variable that I already explained. If a firm has the event in two consecutive years, say in 2012 and 2013, the pre year for 2012 event is 2011, and the pre year for 2013 event is 2012. The problem arises here as 2012 is both the pre and the post year. I don't want such situations to happen, so I only want to keep 2011 (pre) and 2012 (post) and delete 2012 (pre) and 2013(post) for this firm. I use -duplicates tag Symbol Year, gen(flag)- to indicate the duplicates (in this example, 2012 pre and post are the duplicates), and I could drop the one with post == 0 & flag == 1. I don't know how to drop the observation in the following year (in this example, the observation for 2013).
Here is some data
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input long Symbol float(Year post) byte flag 62 2016 0 0 62 2017 1 0 62 2018 0 0 62 2019 1 0 416 2013 0 0 416 2014 1 1 416 2014 0 1 416 2015 1 0 426 2013 0 0 426 2014 1 0 426 2018 0 0 426 2019 1 0 547 2012 0 0 547 2013 1 0 557 2013 0 0 557 2014 1 0 560 2010 0 0 560 2011 1 0 2356 2013 0 0 2356 2014 1 1 2356 2014 0 1 2356 2015 1 0 2713 2014 0 0 2713 2015 1 1 2713 2015 0 1 2713 2016 1 0 end format %ty Year
Any help would be appreciated. Thanks!
0 Response to delete observations according to some rules
Post a Comment