Hi,

I am trying to conduct an interrupted time series analysis with an intervention group and a control group and count (Poisson) data.

My dependent variable is the hospital admission rate, which I am modelling as Poisson distributed: the outcome is the count of admissions ('freq'), with log(population) as an offset (the population is in the variable 'dementia').

I have a variable "index_num" to identify intervention and control groups (zero for control, one for intervention), and each has 60 monthly timepoints (in the variable 'timepoint' going from 0 to 59).

I would like to estimate Newey-West standard errors to account for autocorrelation. I would normally include several other variables to make the interrupted time series comparisons (e.g. level change and slope change after the intervention), but here I show a model without additional variables to demonstrate the issue as simply as possible; the problem occurs regardless of how many other variables are in the model.
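For context, the fuller model would look roughly like the sketch below. The intervention month (36) and the variable names 'post' and 'time_since' are made up for illustration and are not from my actual data; the sketch uses robust standard errors only so that it runs on both groups.

```stata
* Hypothetical: assume the intervention starts at timepoint 36
generate byte post = timepoint >= 36

* Months elapsed since the intervention (0 before it)
generate time_since = cond(post, timepoint - 36, 0)

* Controlled ITS: baseline trend, level change and slope change,
* each interacted with the group indicator index_num
glm freq c.timepoint##i.index_num i.post##i.index_num ///
    c.time_since##i.index_num, ///
    family(poisson) link(log) exposure(dementia) vce(robust) eform
```

The interaction terms with index_num are the quantities of interest, i.e. how the level and slope changes differ between the intervention and control groups.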

If I 'tsset' the data as a panel (the panels being the intervention and control series) and try to run a Poisson regression with Newey-West errors as follows:

Code:
tsset index_num timepoint
glm freq, family(poisson) link(log) vce(hac nwest) exposure(dementia) eform
I get this error message: "repeated time values in sample r(451);".

I assume this is because I do have repeated time values: each timepoint appears twice, once in the intervention series and once in the control series. I have searched online and seen a few people mention similar issues, but have found no solution so far.

If I run the same regression on just one of my groups (e.g. the intervention group) as follows, there are no problems:

Code:
keep if index_num == 1
tsset timepoint
glm freq, family(poisson) link(log) vce(hac nwest) exposure(dementia) eform
Alternatively, I can run the regression on both groups without Newey-West errors (e.g. with robust standard errors instead), and this runs fine.
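For reference, the version that runs on both groups is something like the following; it is the same model as above, with vce(robust) in place of vce(hac nwest):

```stata
* Both groups in the sample, panel declared as before
tsset index_num timepoint

* Robust (sandwich) standard errors instead of Newey-West
glm freq, family(poisson) link(log) vce(robust) exposure(dementia) eform
```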

I think that glm can't compute Newey-West errors with panel data.

Does anyone have any solution to this issue?

I have thought of a couple of options, but can't figure out how to implement either:
1. Somehow differencing the intervention and control groups to get a single time series. However, the two groups have different underlying populations, so I'm not sure how I could appropriately subtract one from the other.
2. Programming my own kernel for the vce(hac ...) option that works with panel data. I'm not an expert mathematician, so I'm not sure what the kernel should look like for panel-data Newey-West errors, or how I would pass it to the Stata command.

Any help with the above or any other possible solutions would be very welcome.

Thanks in advance for any help.

Tim