I am working with data from hospitals in England that are submitting information about the number of COVID-19 patients they are treating. Each day, each hospital should supply information on several different elements of treatment.

The problem is that some hospitals miss submissions on some days (true missing values), whereas other hospitals submit a false zero for some of the data items on some of the days.

I want to create a national time series adjusted for these missing data and false zeros. Can anyone suggest a recommended approach for this type of issue, please?

At the moment I am replacing false zeros with missing values using a semi-manual process that is far from ideal. I then impute the missing values using a polynomial if the hospital has more than 4 observations in total, and using
Code:
poisson y c.day##c.day i.hospital
otherwise.
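
To make the workflow concrete, here is a rough sketch of the kind of steps involved. It is an illustration under assumptions rather than my exact code: the variable names (hospital, day, y) are placeholders, the rule for flagging a false zero (a zero with positive counts on both neighbouring days) is only an example, and I have assumed a quadratic for the hospital-specific polynomial.

```stata
* Assumed layout: one row per hospital per day, with
*   hospital = numeric hospital id, day = numeric date, y = reported count
* (y is already missing on days with no submission).

* 1. Flag suspected false zeros. Illustrative rule only: a zero whose
*    neighbouring days for the same hospital are both positive.
bysort hospital (day): gen byte false0 = y == 0 ///
    & y[_n-1] > 0 & y[_n+1] > 0 & !missing(y[_n-1], y[_n+1])
replace y = . if false0

* 2. Count the non-missing observations for each hospital
bysort hospital: egen n_obs = count(y)

* 3. Pooled Poisson model; its predictions are used for sparse hospitals
poisson y c.day##c.day i.hospital
predict yhat_pool, n

* 4. Hospital-specific quadratic (assumed order) where n_obs > 4
gen double yhat_own = .
levelsof hospital if n_obs > 4, local(hs)
foreach h of local hs {
    quietly regress y c.day##c.day if hospital == `h'
    quietly predict double tmp if hospital == `h', xb
    quietly replace yhat_own = tmp if hospital == `h'
    drop tmp
}

* 5. Fill the gaps only, then sum to a national daily series
gen double y_adj = y
replace y_adj = yhat_own  if missing(y_adj) & n_obs > 4
replace y_adj = yhat_pool if missing(y_adj)
collapse (sum) y_adj, by(day)
```

Two caveats with this sketch: the linear polynomial fit can produce negative predictions, and collapse (sum) treats any value that is still missing as zero, so the national total is understated wherever a gap could not be filled.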

Many thanks for any insight you can offer.

Rob