Hello everyone,

I am estimating the effect of a law change on certain policies across US states. My dataset covers 1980 to 2017. My treatment variable is binary and takes the value 1 in state-years where the law is in force, and 0 otherwise.

My dataset contains four types of states. First, there are states that never undergo treatment, i.e., the control group. Second, there is a treatment group with heterogeneous treatment timing (staggered adoption). Third, there are always-treated states, for which the treatment variable equals 1 throughout the sample period; you can think of these as states where treatment occurred before my sample starts in 1980. Finally, there are relatively few states whose treatment status switches on and off multiple times (sometimes twice, sometimes more than twice) over the sample period.
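To make the four groups concrete, here is a minimal sketch of how they could be flagged in Stata. It assumes a state-year panel with variables named state, year, and treatment_var (coded 0/1, no missing values); the helper variables and group labels are placeholders I made up for illustration. A state that starts treated and later turns the law off is counted as a reversal here.

```stata
* Assumed variable names: state, year, treatment_var (0/1, no missing values)
bysort state: egen byte min_treat = min(treatment_var)
bysort state: egen byte max_treat = max(treatment_var)

* Treatment status in the first sample year of each state
bysort state (year): gen byte first_treat = treatment_var[1]

* Count how many times treatment status changes within a state
bysort state (year): gen byte changed = treatment_var != treatment_var[_n-1] if _n > 1
bysort state: egen n_changes = total(changed)

* 0 = never treated, 1 = always treated,
* 2 = staggered adopter (switches on once and stays on),
* 3 = treatment reversal (everything else)
gen byte treat_group = .
replace treat_group = 0 if max_treat == 0
replace treat_group = 1 if min_treat == 1
replace treat_group = 2 if n_changes == 1 & first_treat == 0
replace treat_group = 3 if missing(treat_group)

label define treat_group_lbl 0 "never treated" 1 "always treated" ///
    2 "staggered adopter" 3 "treatment reversal"
label values treat_group treat_group_lbl

* Tag one row per state so the table counts states, not state-years
egen byte tag_state = tag(state)
tabulate treat_group if tag_state
```

The final tabulation shows how many states fall into each group, which is worth checking before deciding what to drop.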

The model which I am estimating takes the following form:

Outcome_st = b_0 + b_1*Treatment_st + b_2*(Set-of-Controls)_st + StateFE_s + YearFE_t + error_st

Stata code:
xtreg outcome_var treatment_var x1 x2 i.year, fe vce(cluster state)

What confuses me are the following points:

(1) Should I drop only the always-treated states from the regression analysis? Would that still be a TWFE/generalized DiD model?

(2) Should I drop only the states that undergo treatment reversals, i.e., those for which treatment status switches on and off?

(3) Should I do both (1) and (2)?

(4) Should I run the model without dropping anything? In that case, would it still be considered a generalized DiD model?
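For reference, options (1) through (4) could be compared side by side with something like the sketch below. It assumes state is a numeric panel identifier and reuses the variable names from my regression above; the flags always_treated and reversal and the stored-estimate names are placeholders introduced here. It only shows how the samples differ, not which one is appropriate.

```stata
* Assumes state is a numeric ID and treatment_var is 0/1 with no missing values
xtset state year

* Always-treated: treatment equals 1 in every sample year of the state
bysort state: egen byte always_treated = min(treatment_var)

* Reversal: treatment turns off at least once after having been on
bysort state (year): gen byte turned_off = treatment_var < treatment_var[_n-1] if _n > 1
bysort state: egen byte reversal = max(turned_off)
replace reversal = 0 if missing(reversal)

* (4) Full sample
xtreg outcome_var treatment_var x1 x2 i.year, fe vce(cluster state)
estimates store full

* (1) Drop always-treated states
xtreg outcome_var treatment_var x1 x2 i.year if !always_treated, fe vce(cluster state)
estimates store no_always

* (2) Drop states with treatment reversals
xtreg outcome_var treatment_var x1 x2 i.year if !reversal, fe vce(cluster state)
estimates store no_reversal

* (3) Drop both
xtreg outcome_var treatment_var x1 x2 i.year if !always_treated & !reversal, fe vce(cluster state)
estimates store no_both

* Compare the treatment coefficient and its standard error across samples
estimates table full no_always no_reversal no_both, keep(treatment_var) b se
```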

(5) Finally, I recently read Goodman-Bacon's (2020) paper on bias in the TWFE model. His decomposition theorem expresses the TWFE estimate as a weighted average of all the underlying 2x2 DiD comparisons. I used his Stata package "bacondecomp" to estimate my model on a sample that includes all states except those whose treatment switches on and off (the decomposition cannot be estimated with these states included). Here are the results:

Outcome variable     Coefficient   SE        z      P > |z|   95% CI
treatment variable   .061513       .035459   1.73   0.083     -.0079854   .1310115

                     Beta            Total Weight
Timing_groups        .0605793393     .119811621
Always_v_timing      .0572147966     .814743692
Never_v_timing       .0679883642     .0542594178
Always_v_never       -2.49957943     .0542594178
Within               .3758984804     .0110969387
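As I understand the decomposition, the component weights should sum to one and the weighted sum of the component betas should reproduce the overall coefficient. Here is a quick arithmetic check using the numbers exactly as pasted above (the scalar names are placeholders). If the totals do not come out as expected, one of the values may have been mis-copied from the bacondecomp output.

```stata
* Component weights exactly as pasted in the table above
scalar w_sum = .119811621 + .814743692 + .0542594178 ///
    + .0542594178 + .0110969387

* Weighted sum of the component betas (beta * weight for each row)
scalar b_sum = .0605793393*.119811621    ///
    + .0572147966*.814743692             ///
    + .0679883642*.0542594178            ///
    + (-2.49957943)*.0542594178          ///
    + .3758984804*.0110969387

display "sum of weights:            " w_sum
display "weighted sum of betas:     " b_sum
display "reported TWFE coefficient: " .061513
```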

Would you please help me interpret this? Does the large weight (about 0.81) on the "always treated vs. timing" comparison imply that my TWFE/generalized DiD estimate is biased? I am having a hard time interpreting these weights and deciding whether or not to drop the always-treated states.

Many thanks for considering my request.