I am evaluating a government program using difference in differences. The program was implemented in 2014, and I have administrative records for the outcome variable from 2010 to 2018 (except for 2014). First I evaluated the program using only two years: 2013 (before the program) and 2015 (after). My regression looks like this:
Code:
regress score i.post##i.treatment X
where post indicates if the observation is from 2015 and X is my control vector. Now I want to include all years, so first I created a dummy for after the program like this:
Code:
gen byte post=(dummy2015==1 | dummy2016==1 | dummy2017==1 | dummy2018==1)
I was about to run a regression like the one above, adding interactions of treatment with a dummy for every year, of post with a dummy for every year, and of treatment, post and a dummy for every year. But I noticed that, for the years before the treatment, the interaction between post and the year dummy does not make sense: an observation from 2010, 2011, 2012 or 2013 will never have post equal to one (by definition). Might there also be collinearity problems with the rest of the variables?
How should I be running this regression? I think I am asking for the specification of difference in differences with multiple periods.
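To make the question concrete, here is a minimal sketch of one common multi-period specification, in which year dummies take the place of the post main effect. It assumes a numeric variable year (2010 to 2013 and 2015 to 2018), which is hypothetical since I only described separate year dummies, and it keeps X as a placeholder for the controls. I am not sure this is the right specification; that is what I am asking.

```stata
* Sketch only: year is an assumed numeric variable, X is a placeholder for the controls.
* The year dummies absorb post, so post is not entered on its own;
* the coefficient on 1.treatment#1.post is the pooled post-program effect.
regress score i.treatment i.year 1.treatment#1.post X
```

Entering post only through its interaction with treatment avoids the collinearity between post and the year dummies.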
Second, I read in Mostly Harmless Econometrics that a good way to test the identification is a Granger-type test. As I understand it, the test consists of making sure that leads do not matter in an equation that contains interactions of treatment with dummies for the years before the program, and then interactions of treatment, years after the program and control variables (triple interactions). What does this mean?
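My tentative reading of the leads part is sketched below: interact treatment with every year dummy, take 2013 as the base year, and jointly test the pre-program interactions. As before, year is an assumed numeric variable and X is a placeholder, and this may not be exactly what the book intends.

```stata
* Sketch only: year is an assumed numeric variable, X is a placeholder for the controls.
* ib2013.year makes 2013 the omitted base year, so each interaction is relative to 2013.
regress score i.treatment##ib2013.year X

* Leads: joint test that the pre-program interactions are all zero.
test 1.treatment#2010.year 1.treatment#2011.year 1.treatment#2012.year
```

If the joint test does not reject, the treated and comparison groups moved together before 2014 in this specification. The interactions for 2015 to 2018 would then be the year-by-year effects.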
Finally, is that really a good way of testing the identification? If not, could you tell me, or give me a reference where I can find out how to test my identification strategy?