Hi everyone,

I have a panel dataset on which I am trying to run a difference-in-differences (DID) analysis. The difficulty is that the treatment occurs in multiple time periods.

As a result, I can't simply compare Year X (post-treatment) with Year Y (pre-treatment), because the treatment does not occur in a single fixed year.

I was hoping someone could help me with the code for the example dataset I created below.


Question: In the following dataset, how can I calculate the difference between post-treatment income and pre-treatment income so that I can use it in my DID model?


Information: Using propensity score matching (psmatch2) and controlling for asset, I want to run a difference-in-differences analysis to see how mergers (indicated by ymerge) affect a firm's income. The treatment group consists of the firm-years that experience a merger; the control group consists of the matched firms. As a rule, a non-missing ymerge always equals the following year, i.e. the merger takes place one year after the observation that records it. I am interested in how income is affected in the year of the merger itself (year + 1, which is the value of ymerge) and one year after the merger (year + 2, which is ymerge + 1).


Dataset:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int(id year asset income ymerge)
111 2000  10  10 2001
111 2001  30  40    .
111 2002  50  90    .
111 2003  50 100 2004
111 2004  90 120    .
111 2005 110 190    .
333 2000  15  10 2001
333 2001  20  45 2002
333 2002  60  90    .
333 2003  80 110 2004
333 2004 125 160    .
333 2005 175 240 2006
333 2006 190 290    .
333 2007 240 380    .
555 2000  40  10    .
555 2001  45  20 2002
555 2002  75  85    .
555 2003 130 195    .
555 2004 140 215    .
end
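
A minimal sketch of one way the pre/post differences could be built for this example, assuming the data above are in memory and that the observation recording ymerge serves as the pre-treatment year. The variable names treated, d_income0 and d_income1 are hypothetical, and overlapping mergers (for example id 333 in 2000 and 2001) are not handled:

```stata
* Sketch only: treated, d_income0 and d_income1 are made-up names
xtset id year

* Firm-years followed by a merger in the next year
gen byte treated = !missing(ymerge)

* Income in the merger year (year + 1) minus income in the pre-merger year
gen d_income0 = F1.income - income

* Income one year after the merger (year + 2) minus income in the pre-merger year
gen d_income1 = F2.income - income

list id year income ymerge treated d_income0 d_income1, sepby(id)
```

For id 111 in 2000 this gives d_income0 = 40 - 10 = 30 and d_income1 = 90 - 10 = 80. The differences are computed for every firm-year, so untreated firm-years have the same outcome available if they are picked as matches; they are missing wherever the panel ends before year + 1 or year + 2.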

For reference:

I basically want to replicate the procedure in the tutorial linked at the end of this post.

They calculated d_earn, the difference between re78 (real earnings in 1978) and re75 (real earnings in 1975), for each observation.

Then they included d_earn in their matched psmatch2 DID model to estimate the ATT.
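
The tutorial's two steps could look roughly like the sketch below. Only re75, re78 and d_earn come from the description above; the treatment indicator treat, the covariates age and educ, and the matching options are assumptions for illustration, not the tutorial's actual specification:

```stata
* Sketch only: treat, age and educ are assumed names, not taken from the tutorial
* psmatch2 is user-written; install it once with: ssc install psmatch2

* Change in real earnings between the pre- and post-treatment years
gen d_earn = re78 - re75

* Nearest-neighbour matching on the propensity score, restricted to common
* support; with the differenced outcome, the reported ATT is a matched DID
psmatch2 treat age educ, outcome(d_earn) neighbor(1) common
```

Because the outcome is already a pre/post difference, the ATT that psmatch2 reports compares the earnings change of treated units with that of their matched controls.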




I am unable to replicate their procedure because they have a cross-sectional dataset covering 2 years (technically 3), with a clearly defined pre-treatment period and post-treatment period.

In my panel dataset, however, the treatment occurs in multiple periods over time, so I cannot define pre- and post-treatment periods the way they did.

Here is the link to the psmatch2 DID tutorial, in case anyone wants to take a look at it. It's a really nice step-by-step guide:

https://www.empiwifo.uni-freiburg.de...g_solution.doc