I am currently cleaning a very large dataset (52 variables, 82,284 observations) for longitudinal analysis. The dataset is based on information returned from 6 different surveys. I have converted it to long format, so there are currently about 6 observations (one per survey year) for each ID, and approximately 13,000 unique IDs. The dataset is confidential, so I have created a fake example dataset to use for this question (hopefully inserted correctly below).
My issue is this: I have tried to create a "death after this wave" variable, which indicates that this was the last wave of data from the person before dying. I now need to delete the waves that the person did not participate in (so someone who participated in only three waves and then died would have only 3 rows of data, whereas someone who was alive for all waves would have 6 rows), but I am struggling to find code that will achieve this. Does anyone have any ideas? Apologies, I am quite a novice!
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double idalias int year float(wave_sg Death_After_This_Wave)
1 1901 0 .
1 1904 1 .
1 1907 2 1
1 1910 3 .
1 1913 4 .
1 1916 5 .
2 1901 0 .
2 1904 1 .
2 1907 2 .
2 1910 3 .
2 1913 4 1
2 1916 5 .
3 1901 0 .
3 1904 1 .
3 1907 2 1
3 1910 3 .
3 1913 4 .
3 1916 5 .
4 1901 0 .
4 1904 1 .
4 1907 2 .
4 1910 3 .
4 1913 4 .
4 1916 5 1
5 1901 0 .
5 1904 1 .
5 1907 2 1
5 1910 3 .
5 1913 4 .
5 1916 5 .
6 1901 0 .
6 1904 1 .
6 1907 2 .
6 1910 3 .
6 1913 4 1
6 1916 5 .
7 1901 0 .
7 1904 1 1
7 1907 2 .
7 1910 3 .
7 1913 4 .
7 1916 5 .
end
I was thinking of something like this: by idalias, sort: drop in 2/5 if _n=1 for Death_After_This_Wave, which to me means: for each ID, drop the years 1904, 1907, 1910, 1913 and 1916 (i.e. observations 2 to 5) if the person died just after the first observation (1901). I could then edit this code and repeat it for the remaining years.
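For illustration, here is a minimal sketch of one way this could be done in a single pass rather than year by year. It assumes that Death_After_This_Wave equals 1 on at most one row per ID and is missing otherwise, as in the example data. The helper variable last_wave and the extra ID 8 (a person who survives all waves) are hypothetical and only there to show the behaviour; this is not the only possible approach.

```stata
* Small illustrative subset: ID 7 from the example data plus a
* hypothetical ID 8 who is never flagged as dying
clear
input double idalias int year float(wave_sg Death_After_This_Wave)
7 1901 0 .
7 1904 1 1
7 1907 2 .
7 1910 3 .
7 1913 4 .
7 1916 5 .
8 1901 0 .
8 1904 1 .
8 1907 2 .
8 1910 3 .
8 1913 4 .
8 1916 5 .
end

* last_wave (hypothetical helper): the wave flagged as the last one
* before death, copied to every row of that person; it stays missing
* for anyone who is never flagged
bysort idalias: egen last_wave = max(cond(Death_After_This_Wave == 1, wave_sg, .))

* Drop every wave after the flagged one; people who are never flagged
* keep all of their rows
drop if wave_sg > last_wave & !missing(last_wave)
drop last_wave
```

In this sketch ID 7 would be left with the 1901 and 1904 rows only, while ID 8 keeps all six rows. The comparison is made on wave_sg rather than on observation numbers, so it should not depend on how the data happen to be sorted.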
Thanks for taking the time to read my query.
Warm regards,
Sarah