Hello everyone,


I have been reading extensively to decide on appropriate models for my data, and I could use some input from others. My dataset follows firms, the unit of analysis, over 8 years (t=8, n= ~23000). Each firm has many observations at each time point. The dependent variable is measured in dollars and is zero 75-80% of the time; these are observed zeros rather than censored values. I am asking specifically about two techniques, GEE and Tobit. I understand they are not the only models I could use, but I am curious about their suitability for these data.

For the primary analysis, I am considering either GEE or fixed effects Tobit. I may run both, with one serving as the main model and the other as a robustness check.

I think xtgee could be a good choice because I could xtset on firm and account for within-firm dependence among observations. Also, from my reading, GEE appears to be robust with regard to distributional assumptions, and I believe it does not assume normally distributed residuals. Would this be an appropriate way to approach my data?
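
For concreteness, here is a minimal sketch of the xtgee specification I have in mind. The variable names (spend, x1, x2, firm_id, year) are placeholders rather than my actual variables, and the Gaussian family with an exchangeable working correlation is only an assumed starting point, not a settled choice.

```stata
* Placeholder names: spend = dollar outcome, x1 x2 = covariates
* Declare the panel variable only; firms have several observations per year,
* so no time variable is set here
xtset firm_id

* Population-averaged model with an exchangeable working correlation
* and robust (sandwich) standard errors
xtgee spend x1 x2 i.year, family(gaussian) link(identity) corr(exchangeable) vce(robust)
```

With vce(robust), my understanding is that the standard errors should remain valid even if the working correlation structure is misspecified.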

I am also considering FE Tobit. It seems like a good option since Tobit can handle many zeros on the dependent variable. However, I understand there is an incidental parameters problem with FE Tobit, although there does not seem to be agreement on how serious it is. Given that I have a large sample and a reasonable T, my understanding is that using FE Tobit should not be an issue. Can anyone shed some light on this? I would love to hear an informed opinion on this point.
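
To make the Tobit question concrete, these are the specifications I am weighing, again with placeholder variable names (spend, x1, x2, firm_id, year). As far as I can tell, Stata has no conditional fixed effects Tobit command, so the "FE" version below is simply a pooled tobit with firm indicators, which is where the incidental parameters concern comes from.

```stata
* Unconditional "fixed effects" Tobit: pooled tobit with firm indicators,
* left-censored at zero. With many firms this may exceed the
* variable limits of some Stata editions.
tobit spend x1 x2 i.year i.firm_id, ll(0) vce(cluster firm_id)

* Random-effects Tobit for comparison; the panel variable must be declared first
xtset firm_id
xttobit spend x1 x2 i.year, ll(0)
```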

For additional robustness, I am planning to run a fixed effects linear model, which I think should be fine. But I am curious to hear others' thoughts on xtgee and FE Tobit for my data.
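
The linear fixed effects check I have in mind would look roughly like this (placeholder variable names spend, x1, x2, firm_id, year, not my actual ones):

```stata
* Declare the panel variable (firm)
xtset firm_id

* Within (fixed effects) estimator with standard errors clustered by firm
xtreg spend x1 x2 i.year, fe vce(cluster firm_id)
```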

Thank you all in advance!