Hi all,

I am under the impression that the panel format is not appropriate for my study and am strongly leaning toward using a pure time series approach. The panel units are subregions of a larger geographical area (like counties within a state). The panel is long (T>N). Here are my concerns; any advice would be appreciated.

1) When in panel format, my response variable consists of ~97% zeros (a corner solution response), which presents challenges. Nearly all of the nonzero responses occur in a handful of zones out of 100+ total zones. OK, but I could just use xtpoisson if that were the only challenge.

2) The panel units have time-invariant characteristics that strongly influence the response, which could be handled with fixed effects. However, those same characteristics are very strongly spatially correlated, which complicates things.

3) Some of my independent variables are "engineered" or interpolated from sparse data. For example, there are only a few weather stations scattered throughout the region, but every zone needs a daily value for rainfall, temperature, and other variables. For some of these variables the interpolated values are unlikely to be realistic, but our only alternative is to exclude them.

4) I am unsure whether the panel unit is even meaningful or worth the hassle. The data just happen to be tabulated in that format, and the subregions are each hydraulically isolated, which may or may not affect the response; we really don't know.

5) One of our critical predictors is only available for 13 out of 100+ zones, which kinda hamstrings the whole panel approach in my opinion.

Am I off the mark?