I'm a new one to this forum, and I wanted to ask for some help regarding a (maybe naive) question that I could not find specific answers to it yet.
I'm currently dealing with a panel data whose observations could be uniquely identified by deal-firm-year. Therefore, a firm could appear multiple times during one year if it involved in more than one deals.
My question is that: how could I only keep those observations which have at least one-year non-missing (based on some other vars, for example, non-missing assets) data both before and after each deal (event)? An example of the data is shown as following:
firm_id | deal_id | year | deal_year | window | assets | sales |
1001 | 0 | 1985 | 1984 | 1 | 5 | 3 |
1001 | 1 | 1998 | 1999 | -1 | 10 | 4 |
1001 | 1 | 2000 | 1999 | 1 | 15 | 7 |
1001 | 2 | 2000 | 2002 | -2 | 15 | 7 |
1001 | 2 | 2001 | 2002 | -1 | 45 | 35 |
1001 | 2 | 2003 | 2002 | 1 | 32 | 24 |
1002 | 2 | 2001 | 2002 | -1 | 59 | 45 |
1002 | 2 | 2003 | 2002 | 1 | 51 | 43 |
1002 | 2 | 2004 | 2002 | 2 | 56 | 23 |
1002 | 3 | 2004 | 2005 | -1 | 56 | 23 |
1002 | 3 | 2006 | 2005 | 1 | . | 65 |
Ideally, I want to obtain the following dataset after manipulation (the red rows has been removed since they did not meet the selection criteria):
firm_id | deal_id | year | deal_year | window | assets | sales |
1001 | 1 | 1998 | 1999 | -1 | 10 | 4 |
1001 | 1 | 2000 | 1999 | 1 | 15 | 7 |
1001 | 2 | 2000 | 2002 | -2 | 12 | 7 |
1001 | 2 | 2001 | 2002 | -1 | 45 | 35 |
1001 | 2 | 2003 | 2002 | 1 | 32 | 24 |
1002 | 2 | 2001 | 2002 | -1 | 59 | 45 |
1002 | 2 | 2003 | 2002 | 1 | 51 | 43 |
1002 | 2 | 2004 | 2002 | 2 | 56 | 23 |
Best,
Ray
0 Response to How should I keep only observations with consecutive data around certain events without missing values in some vars
Post a Comment