Hello everyone, I hope you are well!

I'm running a regression that examines the impact of the proportion of female directors on the number of M&A deals a firm undertakes in a year (my dependent variable).

Because my DV is count data, a Poisson-type model seems appropriate. At first glance, comparing the mean and variance of the number of deals suggests that over-dispersion is present. This hints that a negative binomial (NB) regression is more appropriate than a Poisson regression.
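As a minimal sketch of the mean-versus-variance check described above (written in Python rather than Stata, and using made-up counts, not my actual data), the comparison amounts to the following. Note that this is an unconditional comparison, so it is only a rough guide: the Poisson regression assumption concerns the variance and mean conditional on the covariates.

```python
from statistics import mean, variance

# Hypothetical deal counts per firm-year (toy data for illustration only),
# with many zeros to mimic years in which a firm undertakes no deals.
deals = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0, 0, 3, 0, 0, 8]

m = mean(deals)
v = variance(deals)  # sample variance (n - 1 denominator)
ratio = v / m        # equals 1 in expectation under a pure Poisson process

print(f"mean = {m:.3f}, variance = {v:.3f}, variance/mean = {ratio:.3f}")
```

A variance-to-mean ratio well above 1 is what I mean by over-dispersion appearing to be present.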

However, to further justify the use of an NB regression over a Poisson regression, I have conducted deviance and Pearson goodness-of-fit tests in Stata:




Deviance goodness-of-fit = 3758.829
Prob > chi2(4775) = 1.0000

Pearson goodness-of-fit = 6002.96
Prob > chi2(4775) = 0.0000




I'm quite confused by these sharply conflicting results. I also have a very high frequency of zeros in my dependent variable, since there are many firm-years in which no deals are undertaken. Could this be contributing to the conflicting results?
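To make the conflict concrete, here is a small sketch (Python, standard library only) that divides each reported statistic by its degrees of freedom and computes a rough z-score. The z-score relies on the large-sample normal approximation to a chi-squared distribution (mean df, variance 2*df); that approximation is my assumption here, not what Stata uses to compute the reported p-values.

```python
import math

df = 4775            # degrees of freedom reported by Stata
deviance = 3758.829  # deviance goodness-of-fit statistic
pearson = 6002.96    # Pearson goodness-of-fit statistic

# Statistic divided by df: values near 1 are expected if the Poisson model fits.
deviance_ratio = deviance / df
pearson_ratio = pearson / df

# Rough normal approximation: chi2(df) has mean df and variance 2*df.
sd = math.sqrt(2 * df)
z_deviance = (deviance - df) / sd
z_pearson = (pearson - df) / sd

print(f"deviance/df = {deviance_ratio:.3f}, z = {z_deviance:.2f}")
print(f"Pearson/df  = {pearson_ratio:.3f}, z = {z_pearson:.2f}")
```

The deviance statistic falls well below its degrees of freedom while the Pearson statistic falls well above, which is why the two p-values sit at opposite extremes.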

I have also read that, in general, these two tests may not be very reliable for assessing the goodness of fit of a Poisson model. With this in mind, I have also run my negative binomial regression and established that the alpha parameter is significantly different from zero, reinforcing that a Poisson regression is not an appropriate model. Apart from this finding, and the variance appearing greater than the mean, are there any other sensible ways to justify the use of a negative binomial regression over a Poisson regression?

For example, would the Bayesian information criterion (BIC) be appropriate? The BIC for the Poisson model is 5808.791, while for the NB model it is, I suppose, only marginally lower at 5690.544.
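For reference, the size of the BIC gap can be computed directly from the two values above (a trivial Python sketch). The threshold of 10 used below is a commonly cited rule of thumb for a "very strong" BIC difference; treating it as the relevant cutoff here is an assumption on my part.

```python
bic_poisson = 5808.791
bic_nb = 5690.544

# Positive difference means the NB model has the lower (preferred) BIC.
delta_bic = bic_poisson - bic_nb

# Rule-of-thumb threshold (assumption): differences above ~10 are often
# described as very strong evidence for the lower-BIC model.
RULE_OF_THUMB = 10
favours_nb = delta_bic > RULE_OF_THUMB

print(f"BIC difference (Poisson - NB) = {delta_bic:.3f}")
print(f"Exceeds rule-of-thumb threshold of {RULE_OF_THUMB}: {favours_nb}")
```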

Thank you greatly for any insight!