I would like your advice on how I constructed my analysis sample. I have a three-wave panel dataset structured as follows:
- Wave 1: the baseline survey, named "wave1" in the data example and code (see below).
- Wave 2: includes two samples: i) a follow-up of wave 1 (named wave1_followup); and ii) a fresh sample to compensate for attrition from wave 1 (named wave2_fresh).
- Wave 3: similarly comprises two samples: i) a follow-up of wave 1 and of both wave 2 samples (named wave2_followup); and ii) a fresh sample to compensate for attrition from wave 2 (named wave3_fresh).
I impose two restrictions on my analysis sample: I keep only individuals who participated in two or more interviews and who had at least one living parent at the time of their first interview. The code below shows how I construct my panel data under these two restrictions, but I am not sure that what I did is correct. Any advice would be highly appreciated.
I first keep only individuals with a parent alive at their first interview, which means filtering wave1, wave2_fresh and wave3_fresh:
Code:
use `wave1', clear
keep if parent_alive==1 // keep only those with either father or mother alive
sort id
tempfile wave1_new
save `wave1_new'

use `wave2_fresh', clear
keep if parent_alive==1 // keep only those with either father or mother alive
tempfile wave2_fresh_new
save `wave2_fresh_new'

use `wave3_fresh', clear
keep if parent_alive==1 // keep only those with either father or mother alive
tempfile wave3_fresh_new
save `wave3_fresh_new'
Code:
* Merge follow-up data of wave 1 to the fresh sample in wave 2
use `wave1_followup', clear
merge 1:1 id using `wave2_fresh_new', nogen
tempfile wave1_2
save `wave1_2'

* Merge follow-up data of wave 1 + wave 2 to the fresh sample in wave 3
use `wave2_followup', clear
merge 1:1 id using `wave3_fresh_new', nogen
tempfile wave2_3
save `wave2_3'

* Append dataset and drop those who are observed once
append using `wave1_2'
append using `wave1_new'
isid id year, sort
by id: gen nwave = _N // number of waves in which each id is observed
drop if nwave==1 // drop those who are observed once
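For comparison, here is a minimal sketch of another way to express the same two restrictions: stack everything in long format first, then apply both restrictions by id. It uses a small toy dataset with made-up values rather than my real data, and the variable names alive_first and nwave are only illustrative. I am not claiming it is the correct approach, only that it shows the logic I am trying to achieve.

```stata
* Toy long-format panel (hypothetical values, for illustration only)
clear
input int id float year byte parent_alive
1 2007 1
1 2009 1
1 2011 0
2 2007 0
2 2009 0
3 2009 1
4 2009 1
4 2011 1
5 2011 1
end

* Restriction 1: at least one parent alive at the FIRST interview of each id
* (alive_first is a hypothetical helper variable)
bysort id (year): gen byte alive_first = parent_alive[1]
keep if alive_first == 1

* Restriction 2: observed in two or more waves
* (data remain sorted by id year after keep, so by id: is valid here)
by id: gen nwave = _N
keep if nwave >= 2

* Confirm that id and year uniquely identify observations
isid id year
list, sepby(id)
```

In this toy example only ids 1 and 4 remain: id 2 has no living parent at the first interview, and ids 3 and 5 are observed only once.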
Code:
*** Wave 1
clear
input int id float(year age) byte sex float parent_alive
1 2007 62 1 1
2 2007 75 1 0
3 2007 58 1 0
4 2007 64 1 1
5 2007 52 0 1
6 2007 65 1 0
7 2007 54 0 0
8 2007 54 1 0
9 2007 64 0 0
10 2007 71 0 0
11 2007 56 0 0
12 2007 66 1 0
13 2007 68 1 0
15 2007 57 1 1
16 2007 58 1 1
17 2007 69 1 0
18 2007 58 0 0
19 2007 71 1 1
20 2007 66 1 0
21 2007 68 0 0
22 2007 65 1 1
23 2007 73 0 0
24 2007 62 1 0
25 2007 57 1 1
26 2007 64 0 0
27 2007 73 1 1
28 2007 51 0 1
29 2007 65 0 0
30 2007 54 0 0
31 2007 51 0 0
end
tempfile wave1
save `wave1'

*** Wave 2 - Fresh sample
clear
input int id float(year age) byte sex float parent_alive
3863 2009 50 1 0
3864 2009 62 0 0
3865 2009 55 1 1
3866 2009 67 1 0
3867 2009 68 0 0
3868 2009 57 0 1
3869 2009 55 1 1
3870 2009 61 0 0
3871 2009 75 1 0
3872 2009 56 0 0
3873 2009 50 0 1
3874 2009 59 0 0
3875 2009 51 0 0
3876 2009 68 1 1
3877 2009 53 1 1
3878 2009 54 0 0
3879 2009 68 0 0
3880 2009 67 1 0
3881 2009 73 1 0
3882 2009 65 0 0
3883 2009 58 1 0
3884 2009 75 1 0
3885 2009 57 1 1
3886 2009 52 0 0
3887 2009 50 1 0
3888 2009 52 1 0
3889 2009 69 0 0
3890 2009 59 0 0
3891 2009 58 0 0
3892 2009 73 0 0
end
tempfile wave2_fresh
save `wave2_fresh'

*** Wave 3 - Fresh sample
clear
input int id float(year age) byte sex float parent_alive
5303 2011 59 0 0
5304 2011 54 0 0
5305 2011 71 1 0
5306 2011 59 1 0
5307 2011 59 1 1
5308 2011 52 1 1
5309 2011 62 0 0
5310 2011 75 0 0
5311 2011 60 1 1
5312 2011 62 0 0
5313 2011 69 1 0
5314 2011 57 1 1
5315 2011 71 0 0
5316 2011 60 0 0
5317 2011 63 1 1
5318 2011 55 0 0
5319 2011 51 0 1
5320 2011 54 1 0
5321 2011 67 0 0
5322 2011 66 1 0
5323 2011 67 0 0
5324 2011 63 0 0
5325 2011 69 0 0
5326 2011 74 1 0
5327 2011 70 0 0
5328 2011 68 0 0
5329 2011 57 0 1
5330 2011 55 0 0
5331 2011 58 1 1
5332 2011 52 1 1
end
tempfile wave3_fresh
save `wave3_fresh'

*** Follow-up of wave 1
clear
input int id float(year age sex parent_alive)
1 2009 64 1 1
2 2009 77 1 0
3 2009 60 1 0
5 2009 55 0 1
6 2009 67 1 0
8 2009 56 1 0
9 2009 66 0 0
12 2009 68 1 0
13 2009 70 1 0
15 2009 59 1 1
16 2009 61 1 1
17 2009 72 1 0
18 2009 60 0 0
19 2009 73 1 1
20 2009 68 1 0
21 2009 70 0 0
22 2009 67 1 1
25 2009 59 1 1
26 2009 66 0 0
27 2009 75 1 1
29 2009 67 0 0
30 2009 57 0 0
31 2009 54 0 0
34 2009 72 1 0
36 2009 62 1 1
38 2009 69 0 0
39 2009 71 0 0
42 2009 61 1 0
43 2009 61 0 1
45 2009 54 0 1
end
tempfile wave1_followup
save `wave1_followup'

*** Follow-up of wave 1 + wave 2
clear
input int id float(year age sex parent_alive)
1 2011 66 1 1
2 2011 79 1 0
3 2011 62 1 0
5 2011 57 0 1
6 2011 69 1 1
8 2011 58 1 1
9 2011 68 0 0
13 2011 72 1 1
15 2011 61 1 1
16 2011 63 1 1
17 2011 74 1 0
18 2011 62 0 0
19 2011 75 1 1
20 2011 70 1 0
21 2011 72 0 0
22 2011 69 1 0
25 2011 61 1 1
26 2011 68 0 0
27 2011 77 1 1
29 2011 69 0 0
30 2011 59 0 1
31 2011 56 0 1
34 2011 74 1 0
36 2011 64 1 0
38 2011 71 0 0
39 2011 73 0 0
42 2011 63 1 0
43 2011 63 0 1
45 2011 56 0 1
46 2011 59 1 0
end
tempfile wave2_followup
save `wave2_followup'