Hello,
I should preface this by saying that the data I am using are confidential, so I cannot share the whole file, but I will share what I can. I am using Stata/IC 16.0.

I want to run a difference-in-differences regression around a reform in 2007. I have a sample of males observed from 2006 to 2017, each with a personal identification number. The sample is large (approx. 60,000), but I am only interested in males who had a child in 2006 or 2007. The dummy variable Child equals 1 if a child was born in that year and 0 otherwise.

I would therefore like to drop all other males. However, "keep if Child>0 & year<2008" keeps only the row for the birth year itself: for males who did have a child in 2006 or 2007, it also drops their observations for every other year, because Child=0 in those years. I want to keep those observations.

I considered generating a dummy variable called Child2006 (gen Child2006=1 if Child>0 & year<2007), but that equals 1 only in the 2006 row and is missing elsewhere. I want it to equal 1 in every year for that male, not just 2006.

Is there a way to do this?
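
One approach that might work, as a sketch only: take the maximum of the condition within each persnr, so that the flag is 1 in every year for any male with a birth in 2006 or 2007. The variable name hadchild0607 is made up for illustration, and this assumes one row per persnr and year with Child coded 0/1 as described above.

```stata
* Sketch, assuming one row per persnr-year and Child coded 0/1.
* hadchild0607 is a hypothetical name, not a variable in the data.
* The expression is 1 in a birth year (2006 or 2007) and 0 otherwise;
* max() within persnr then spreads that 1 to all of the person's rows.
bysort persnr: egen hadchild0607 = max(Child == 1 & inrange(year, 2006, 2007))

* Keep all years (2006-2017) for the males flagged above.
keep if hadchild0607 == 1
```

On the example data, this would be expected to keep all twelve rows for persnr 1231802 and drop persnr 1231803.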


Here is an example of my data:

persnr year Child
1231802 2006 0
1231802 2007 1
1231802 2008 0
1231802 2009 0
1231802 2010 0
1231802 2011 0
1231802 2012 0
1231802 2013 0
1231802 2014 0
1231802 2015 0
1231802 2016 0
1231802 2017 0
1231803 2006 0
1231803 2007 0
1231803 2008 0
1231803 2009 0
1231803 2010 0
1231803 2011 0
1231803 2012 0
1231803 2013 0
1231803 2014 0
1231803 2015 0
1231803 2016 0
1231803 2017 0


Thank you