I am trying to conduct an interrupted time series analysis with an intervention group and a control group and count (Poisson) data.
My dependent variable is hospital admission rates, and I am considering this to be Poisson distributed (using count of admissions, called 'freq', and log(population) as an offset - my population is in the variable 'dementia').
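To be concrete about the offset: as far as I understand, the two specifications sketched below should be equivalent in glm. The variable log_pop is a helper I am introducing here purely for illustration; it is not in my actual dataset.

```stata
* Sketch only: log_pop is a helper variable created for this illustration
generate double log_pop = ln(dementia)

* Poisson model with log(population) entered explicitly as an offset ...
glm freq, family(poisson) link(log) offset(log_pop) eform

* ... which should give the same fit as letting glm take the log itself
glm freq, family(poisson) link(log) exposure(dementia) eform
```

I use the exposure() form in the commands in this post.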
I have a variable "index_num" to identify intervention and control groups (zero for control, one for intervention), and each has 60 monthly timepoints (in the variable 'timepoint' going from 0 to 59).
I would like to estimate Newey-West standard errors to account for autocorrelation. I would normally include a bunch of other variables to make the interrupted time series comparisons (e.g. level change, slope change after intervention), but here I am going to show a model without additional variables to demonstrate my issue as simply as possible because it doesn't seem to matter how many other variables are in the model.
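For context, the fuller controlled interrupted time series model I have in mind looks roughly like the sketch below. The intervention month (30) and the variable names post and time_after are placeholders for illustration, not my actual values. I use vce(robust) here only so that the command runs on the stacked data; it does not account for autocorrelation, which is the whole problem.

```stata
* Sketch only: month 30 as the intervention point is a placeholder value
generate byte post = timepoint >= 30            // 1 from the intervention month onwards
generate time_after = max(0, timepoint - 30)    // months elapsed since the intervention

* Controlled ITS: the group interactions give the difference between
* intervention and control in baseline slope, level change and slope change
glm freq i.index_num##(c.timepoint i.post c.time_after), ///
    family(poisson) link(log) exposure(dementia) vce(robust) eform
```

The coefficients on the index_num interactions with post and time_after would be the level-change and slope-change comparisons mentioned above.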
If I 'tsset' the data as a panel (the panels being the intervention and control series) and try to run a Poisson regression with Newey-West errors as follows, the command fails with an error:
Code:
tsset index_num timepoint
glm freq, family(poisson) link(log) vce(hac nwest) exposure(dementia) eform
I am sure the error occurs because I have repeated time values (one time series each for intervention and control). I have searched online and seen a few people mention similar issues, but have found no solution so far.
If I run the same regression on just one of my groups (e.g. the intervention group) as follows, there are no problems:
Code:
keep if index_num == 1
tsset timepoint
glm freq, family(poisson) link(log) vce(hac nwest) exposure(dementia) eform
So I think that glm cannot compute Newey-West errors with panel data.
Does anyone have any solution to this issue?
I have thought of a couple of options but can't figure out how to do this:
1. Somehow difference the intervention and control series to get a single time series. However, the two groups have different underlying populations, so I'm not sure how to appropriately subtract one from the other.
2. Find a way to program my own kernel for the vce(hac ...) option that works with panel data. I'm not an expert mathematician, so I'm not sure what the kernel should look like for panel data Newey-West errors, or how to pass it to the Stata command.
Any help with the above or any other possible solutions would be very welcome.
Thanks in advance for any help.
Tim