Hello!

This inquiry stems from a conversation I had with Clyde a few months ago (see link below).

https://www.statalist.org/forums/for...in-differences

I am evaluating a government intervention in New York City. I have monthly crime data over a five-year period (Jan-2012, Feb-2012, Mar-2012, ..., Nov-2016, Dec-2016) for all precincts in the city. I estimate several models using a classical difference-in-differences (DD) framework and then extend this to a more general ("generalized") DD setting when I evaluate different iterations of the program carried out over several years.
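For concreteness, the data are organized roughly like this (variable names below are placeholders for illustration, not my actual ones):

    * Illustrative setup: one row per precinct-month, Jan-2012 through Dec-2016.
    * precinct, year, and month are assumed variable names.
    generate mdate = ym(year, month)
    format mdate %tm
    xtset precinct mdate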

The classical DD model interacts a treatment indicator (1 for the treatment group, 0 for the control group) with a "post" indicator that equals 1, in both experimental groups, for the period after the intervention was implemented. All outcomes are crime rates that are somewhat right-skewed. When I cluster on precinct, my standard errors deflate considerably, even with more than 70 clusters. Because serial correlation "within" clusters is likely, I also experimented with other covariance estimators (newey) that are robust to heteroskedasticity and autocorrelation.
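For reference, my clustered specification is along these lines (again with placeholder names; crime_rate is the outcome, treat and post the two indicators):

    * Classical DD: treatment indicator interacted with a post-period indicator,
    * with standard errors clustered at the precinct level.
    regress crime_rate i.treat##i.post, vce(cluster precinct)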

Re-estimating the DD models using HAC standard errors inflates my standard errors, though I suspect I am using this approach only to be more conservative. Is it inappropriate to ignore the clustering? Should I report both? I feel I need to cluster because of the experimental design: precincts were selected to receive the intervention due to an idiosyncratic crime surge.
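The HAC version I have been running is along these lines (the lag length here is an arbitrary illustrative choice, not a considered one):

    * Newey-West HAC standard errors; requires the data to be xtset/tsset (as above).
    newey crime_rate i.treat##i.post, lag(12)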

I worry because cluster-robust standard errors deflate my standard errors and lead to significant program effects, whereas the HAC estimators move in the opposite direction.
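If reporting both turns out to be reasonable, I was thinking of something like the following to show the two sets of standard errors side by side (a sketch only, with the same placeholder names as above):

    * Store and compare both sets of estimates.
    regress crime_rate i.treat##i.post, vce(cluster precinct)
    estimates store clustered
    newey crime_rate i.treat##i.post, lag(12)
    estimates store hac
    estimates table clustered hac, b se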

Any thoughts or similar experiences out there?