
This is my first question here. I've read pretty much all the posts on Poisson regression, but I'm still confused about why, on my data, Log-OLS, Poisson with robust SEs, and Negative Binomial (also with robust SEs) give different results. The sign of the key coefficient differs across models, and so do the significance levels.

I'm studying the effect of a set of policies on the number of patents each firm generates. The policies are launched in different states in different years, so I use a diff-in-diff identification strategy (each firm appears in only one state). The baseline model is: Y_it = Treated_it + state FEs + year FEs + controls, where Y_it is the number of patents for firm i in year t and Treated_it is the diff-in-diff variable.

Let me be a bit clearer about the Y variable. I'm using an unbalanced panel of many patenting firms. When a firm does not patent in a given year, its Y is 0. When it patents, its Y is the number of patents it files. Below is the distribution of this variable.

application |
       _num |      Freq.     Percent        Cum.
          0 |     38,728       75.44       75.44
          1 |      7,435       14.48       89.92
          2 |      2,224        4.33       94.25
          3 |      1,006        1.96       96.21

       1011 |          1        0.00       99.98
       1144 |          1        0.00       99.98
       1154 |          1        0.00       99.98
       1166 |          1        0.00       99.98
       1468 |          1        0.00       99.99
       1651 |          1        0.00       99.99
       1699 |          1        0.00       99.99
       2067 |          1        0.00       99.99
       2254 |          1        0.00       99.99
       2479 |          1        0.00      100.00
       3344 |          1        0.00      100.00
       5608 |          1        0.00      100.00
As you can see, I have both many zeros and many large "outliers."
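
To put a number on the tail, here is a small Python sketch that uses only the rows visible in the table above. The omitted middle rows are not included, so these are partial sums rather than totals for the full sample, and the variable names are made up for illustration.

```python
# Rows visible in the tabulation above: (patent count, number of firm-years).
# The omitted middle rows are NOT included, so these are partial sums only.
low_rows = [(1, 7435), (2, 2224), (3, 1006)]
top_values = [1011, 1144, 1154, 1166, 1468, 1651,
              1699, 2067, 2254, 2479, 3344, 5608]  # each value appears once

low_obs = sum(freq for _, freq in low_rows)
low_patents = sum(count * freq for count, freq in low_rows)
top_patents = sum(top_values)

print(f"{low_obs} firm-years with 1-3 patents hold {low_patents} patents")
print(f"{len(top_values)} largest firm-years hold {top_patents} patents")
```

On these rows, the 12 largest firm-years account for more patents (25,045) than the 10,665 firm-years with one to three patents (14,901), so a model fitted to the level of Y can be driven by a handful of observations.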

I started by estimating the model with Log-OLS, that is, with the log of Y as the dependent variable. The results look like this (I'm only showing the upper part to save space):

Linear regression                               Number of obs     =     50,052
                                                F(91, 9362)       =          .
                                                Prob > F          =          .
                                                R-squared         =     0.4442
                                                Root MSE          =     .42304

                                        (Std. Err. adjusted for 9,363 clusters in firm_id)
                         |               Robust
     application_num_log |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                 treated |     .02698   .0074484     3.62   0.000     .0123795    .0415806
                     ppp |  -.0593712   .0185295    -3.20   0.001    -.0956931   -.0230492
                  reward |  -.0714084    .016627    -4.29   0.000     -.104001   -.0388159
          employment_log |   .0214589   .0040019     5.36   0.000     .0136142    .0293036
        total_profit_log |   .2982971   .1340681     2.22   0.026     .0354945    .5610997
        total_assets_log |   .0164814   .0032218     5.12   0.000      .010166    .0227967
          cum_claims_log |   .2868466   .0078559    36.51   0.000     .2714474    .3022459
                     age |  -.0365134   .0034587   -10.56   0.000    -.0432932   -.0297336
Then I tried Negative Binomial (using the nbreg command). The results are below (again, the coefficients for the FEs are omitted to save space):

Negative binomial regression                    Number of obs     =     50,052
                                                Wald chi2(91)     =          .
Dispersion           = mean                     Prob > chi2       =          .
Log pseudolikelihood = -38083.218               Pseudo R2         =     0.2805

                                        (Std. Err. adjusted for 9,363 clusters in firm_id)
                         |               Robust
         application_num |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                 treated |   .2073176    .064875     3.20   0.001     .0801651    .3344702
                     ppp |  -.2657429   .1499111    -1.77   0.076    -.5595634    .0280775
                  reward |  -.4701472   .1618851    -2.90   0.004    -.7874362   -.1528581
          employment_log |   .0392265   .0191232     2.05   0.040     .0017457    .0767073
        total_profit_log |   .0947855   .0950564     1.00   0.319    -.0915217    .2810927
        total_assets_log |   .1030913   .0153379     6.72   0.000     .0730297     .133153
          cum_claims_log |     .78421   .0119894    65.41   0.000     .7607112    .8077088
                     age |  -.1551658   .0195564    -7.93   0.000    -.1934957   -.1168359
                   _cons |  -8.313501   7.020127    -1.18   0.236     -22.0727    5.445696
                /lnalpha |   -.040108     .08049                     -.1978655    .1176495
                   alpha |   .9606857   .0773256                      .8204802     1.12485
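
For reading the alpha estimate: with Dispersion = mean, nbreg uses the NB2 parameterization, in which the conditional variance is mu * (1 + alpha * mu), whereas Poisson imposes variance = mu. The sketch below plugs the reported alpha into that formula at a few means. The means are arbitrary values chosen for illustration, not fitted values from this model.

```python
ALPHA = 0.9606857  # alpha reported in the nbreg output above


def nb2_variance(mu, alpha=ALPHA):
    """Conditional variance under NB2 (mean dispersion): mu * (1 + alpha * mu)."""
    return mu * (1.0 + alpha * mu)


# Illustrative means only; these are not fitted values from the model.
for mu in (0.5, 1.0, 10.0, 100.0):
    variance = nb2_variance(mu)
    ratio = variance / mu  # NB2 variance relative to the Poisson variance (= mu)
    print(f"mu={mu:>6}: NB2 variance={variance:10.2f}, ratio to Poisson={ratio:7.2f}")
```

The ratio grows linearly in the mean, so the gap between the two variance assumptions is largest for exactly the high-count observations.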
Then I tried Poisson (using the poisson command). Now the problem comes up:

Poisson regression                              Number of obs     =     50,052
                                                Wald chi2(91)     =          .
Log pseudolikelihood = -48064.653               Prob > chi2       =          .

                                        (Std. Err. adjusted for 9,363 clusters in firm_id)
                         |               Robust
         application_num |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                 treated |   -.016993   .1490338    -0.11   0.909     -.309094    .2751079
                     ppp |  -.4801813   .1427598    -3.36   0.001    -.7599853   -.2003773
                  reward |  -.3160758   .2234765    -1.41   0.157    -.7540817    .1219301
          employment_log |  -.0106694    .039562    -0.27   0.787    -.0882095    .0668706
        total_profit_log |  -.0990516   .0746123    -1.33   0.184    -.2452889    .0471858
        total_assets_log |   .1005228   .0399201     2.52   0.012     .0222807    .1787648
          cum_claims_log |   .8592099   .0195339    43.99   0.000      .820924    .8974957
                     age |   -.121652   .0475784    -2.56   0.011     -.214904      -.0284
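
Because both count models use a log link, the coefficient on treated can be put on a common scale as a percent change in the expected count, (exp(b) - 1) * 100. The sketch below does this with the numbers copied from the nbreg and poisson outputs above, and checks whether the Negative Binomial point estimate falls inside the Poisson confidence interval. The Log-OLS coefficient is left out because it refers to a transformed outcome.

```python
import math

# Coefficient on `treated` and its 95% CI, copied from the outputs above.
estimates = {
    "nbreg": (0.2073176, 0.0801651, 0.3344702),
    "poisson": (-0.016993, -0.309094, 0.2751079),
}


def pct_change(b):
    """Percent change in the expected count for a one-unit change (log link)."""
    return (math.exp(b) - 1.0) * 100.0


for name, (b, lo, hi) in estimates.items():
    print(f"{name}: {pct_change(b):+.1f}% "
          f"(95% CI {pct_change(lo):+.1f}% to {pct_change(hi):+.1f}%)")

nb_point = estimates["nbreg"][0]
_, poisson_lo, poisson_hi = estimates["poisson"]
nb_inside_poisson_ci = poisson_lo <= nb_point <= poisson_hi
print("NB point estimate inside Poisson CI:", nb_inside_poisson_ci)
```

The Poisson interval is wide enough to contain the Negative Binomial point estimate, so part of the apparent disagreement between these two models reflects the imprecision of the Poisson estimate rather than a clear contradiction.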
For reference, here is the Poisson model estimated with the glm ... f(poisson) cluster(firm_id) command. It seems to me that the data are moderately over-dispersed ((1/df) Pearson = 5.30).

Generalized linear models                         No. of obs      =     50,052
Optimization     : ML                             Residual df     =     50,002
                                                  Scale parameter =          1
Deviance         =  67060.11252                   (1/df) Deviance =   1.341149
Pearson          =  265176.3593                   (1/df) Pearson  =   5.303315

Variance function: V(u) = u                       [Poisson]
Link function    : g(u) = ln(u)                   [Log]

                                                  AIC             =     1.9588
Log pseudolikelihood = -48970.92686               BIC             =  -474002.4

                                        (Std. Err. adjusted for 9,363 clusters in firm_id)
                         |               Robust
         application_num |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                 treated |  -.0196893   .1584752    -0.12   0.901    -.3302949    .2909164
                     ppp |  -.4128056    .168902    -2.44   0.015    -.7438475   -.0817638
                  reward |  -.3372642   .2281683    -1.48   0.139    -.7844658    .1099374
          employment_log |  -.0054456   .0349616    -0.16   0.876    -.0739691     .063078
        total_profit_log |  -.0989554    .072847    -1.36   0.174     -.241733    .0438222
        total_assets_log |   .1070102   .0336612     3.18   0.001     .0410354     .172985
          cum_claims_log |   .8808825   .0200839    43.86   0.000     .8415189    .9202462
                     age |  -.1906753   .0358942    -5.31   0.000    -.2610266   -.1203241
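
The two dispersion statistics in the glm header are simple ratios of the fit statistic to the residual degrees of freedom, and values near 1 are what the Poisson variance assumption would imply. A minimal sketch reproducing them from the numbers reported above:

```python
# Values copied from the glm output above.
deviance = 67060.11252
pearson_chi2 = 265176.3593
residual_df = 50002

deviance_dispersion = deviance / residual_df
pearson_dispersion = pearson_chi2 / residual_df

print(f"(1/df) Deviance = {deviance_dispersion:.6f}")
print(f"(1/df) Pearson  = {pearson_dispersion:.6f}")
```

The Pearson ratio is well above the deviance ratio. The Pearson statistic is generally more sensitive to a few very large residuals, which would be consistent with the long right tail of the patent counts, although that reading is a conjecture rather than something the output establishes.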
I've also tried zero-inflated Poisson, but the results do not change much.

Can anyone help me understand why the Poisson results are so different from the Log-OLS and Negative Binomial results? What could have caused this?
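
To make the question concrete, here is a toy example with made-up numbers (not the data from this post) in which the two approaches disagree in sign by construction. With a single binary regressor, the Poisson coefficient equals the log of the ratio of the group means of Y, while the OLS coefficient on a logged outcome equals the difference in the group means of the logged outcome. The sketch uses log(1 + Y) purely as an assumption so that zeros are defined; the exact transformation behind application_num_log is not stated above.

```python
import math

# Made-up counts for illustration only (not the data from this post).
control = [0, 0, 0, 0, 0, 0, 0, 0, 0, 100]  # mostly zeros plus one huge value
treated = [1, 1, 2, 2, 1, 1, 2, 2, 1, 1]    # small but consistently positive


def mean(values):
    return sum(values) / len(values)


# Poisson with one dummy regressor: coefficient = log(ratio of group means).
poisson_coef = math.log(mean(treated) / mean(control))

# OLS of log(1 + y) on the dummy: coefficient = difference in group means of
# the transformed outcome. log1p is an assumed way of handling the zeros.
log_ols_coef = (mean([math.log1p(y) for y in treated])
                - mean([math.log1p(y) for y in control]))

print(f"Poisson coefficient: {poisson_coef:+.3f}")
print(f"Log-OLS coefficient: {log_ols_coef:+.3f}")
```

In this toy case the Poisson estimate follows the arithmetic mean and is pulled negative by the single large count in the control group, while the log transformation compresses that count and yields a positive estimate. Whether something similar is happening in the real data is the open question.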

Thank you so much for your help!!!
