I'm trying to implement the Sun & Abraham (2020) correction for an event study with heterogeneous effects (http://economics.mit.edu/files/14964 short story: if there is heterogeneity in dynamic effects across cohorts the traditional ES estimation gives uninterpretable results).

The suggest estimating the dynamics separately for each cohort using interactions and then averaging the
t+s effect across cohorts who were treated at least s periods ago. This seemed like a natural application for stata's categorical variable toolkit, but I can't get stata to drop the interactions I want.

I wrote this to omit all of the
`n'.event interactions, but instead it only omits the `n'.event#1.cohort interaction.
Code:
* recode event to all positive values for categorical operator
sum event
local n = -1*`r(min)'
replace event = event + `n'
* recode event to infinity for never_treated
replace event = 100 if never_treated==1
replace cohort = 0 if never_treated==1

areg outcome ib(`n').event#ib(0).cohort i.year, a(id) cluster(id)
I didn't see anything in the manual: https://www.stata.com/manuals13/u25.pdf. Obviously I can code up the needed variables by hand (see below), but (1) that would be a lot of variables in more complicated settings--which is the point of the indicator operators isn't it?--and (2) at a scientific level this leaves me wondering whether I understand what stata is omitting in other contexts.

Thanks in advance!



Here is a full working example, (which by the way shows one of the cool points from the paper, that pre-trends coefs capture information from treated time periods)
Code:
set seed 26012021
clear
set obs 200000
gen id = ceil(_n/20)
bys id: gen year = 1999+_n
gen r1 = runiform()
gen r2 = runiform()

* Who gets treated when
bys id: gen never_treated = r1[1]<.5
by id: egen maxr = max(r2)
by id: egen cohort = min(cond(r2==maxr,year,.))

* Make regression variables
gen event = year-cohort
gen treated = year>cohort & never_treated==0
by id: gen outcome = r1[2] + .1*sin(year/3) + rnormal()
replace outcome = outcome + (2020-cohort)/30*sqrt(event)*treated if event>=0

* bin endpoints
tab event
replace event = -12 if event<-12
replace event = 12 if event>12

* Recode to positive
replace event = event + 13

* recode event to infinity for never_treated
replace event = 100 if never_treated==1
replace cohort = 0 if never_treated==1

* traditional event study
areg outcome ib(13).event i.year, a(id) cluster(id)

* Event coefs
mat A = J(25,4,0)
forval i=1/25{
mat A[`i',1] = [`i' -13 ,_b[`i'.event],_b[`i'.event]+1.96*_se[`i'.event], _b[`i'.event]-1.96*_se[`i'.event]]
}


* How I think I should be able to do S&A2020
areg outcome ib(13).event#ib(0).cohort i.year, a(id) cluster(id)



* My fix by hand
forval i = 1/25 {
forval j = 2000/2019 {
qui count if event==`i' &cohort==`j'
if `r(N)' >0 {
gen e`i'_c`j' = event==`i' &cohort==`j'
}
}
}


areg outcome e1_c2012-e12_c2019 e14_c2000-e25_c2007 i.year, a(id) cluster(id)

mat Q = J(25,20,0)
mat R = J(25,20,0)


forval i=1/25 {
qui count if event == `i'
local den = `r(N)'
forval j = 2000/2019 {
cap mat Q[`i',`j'-1999] = _b[e`i'_c`j']
qui count if cohort ==`j' & event == `i'
local num = `r(N)'
mat R[`i',`j'-1999] = `num'/`den'
}
}

mata : st_matrix("S", rowsum(st_matrix("Q"):*st_matrix("R")))

preserve
svmat A
svmat S
gen S2 = _n-13 if _n<=25

tw (rarea A4 A3 A1, color(gs12)) (connected S1 S2, lc(black) mc(black) mfc(white) msize(1)) ///
(connected A2 A1, lc(black) mc(black) mfc(white) msize(1) lp(-)) ///
, ylab( , format(%3.2f) angle(0)) xlab(-12 "t{&le}-12" -9 -6 -3 0 3 6 9 12 "t{&ge}12") ///
xtitle(" " "Years from Event") ytitle("Effect Size" " ") ///
xsize(8) graphregion(color(white)) legend( order(3 "Standard Design" 2 "Cohort Average (Sun & Abraham)"))