Hi

I have a question regarding the use of fixed effects (FE) and clustered standard errors (SE). Most of the threads on Statalist on this topic are based on panel data.
My data set does not have a conventional panel structure.

Small snapshot of my data:
Code:
 invt_id    patent    appyear    invt_network_size    teamsize    cbsacode    providers    internetdummy
03858276-1    06185789    1999    17    2    24860    0    0
03858299-1    06957483    2003    7    1    33780    7    1
03858315-1    06584696    2002    17    1    14860    10    1
03858315-1    06317990    1999    17    1    14860    2    1
03858315-1    06393706    2001    17    1    14860    10    1
03858390-1    06918569    2003    13    1    38060    0    0
03858390-1    06931831    2003    13    1    38060    0    0
03858390-1    07155896    2004    14    2    38060    0    0
03858390-1    06786236    2003    13    1    38060    0    0
03858390-1    07240695    2003    13    1    38060    0    0
03858390-1    06783108    2002    13    1    38060    0    0
03858390-1    07384248    2004    14    1    38060    0    0
03858390-1    07527068    2004    14    1    38060    0    0
03858390-1    06390129    2000    13    2    38060    0    0
03858390-1    06250602    2000    13    1    38060    0    0
03858390-1    06371740    2000    13    1    38060    0    0
03858401-1    06244347    1999    13    1    26420    0    0
03858401-1    06173782    1999    13    1    26420    0    0

I'm not sure whether I should cluster on cbsacode when running nbreg; clustering the standard errors strongly changes the interpretation of my results. I've run the model both with and without clustering:

1) without

Code:
 
nbreg teamsize internetdummy invt_network_size i.cbsacode i.appyear, vce(robust)

Negative binomial regression                    Number of obs     =    462,187
                                                Wald chi2(497)    =          .
Dispersion           = mean                     Prob > chi2       =          .
Log pseudolikelihood = -851639.86               Pseudo R2         =     0.0225

-----------------------------------------------------------------------------------
                  |               Robust
         teamsize |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
    internetdummy |  -.0138526   .0022496    -6.16   0.000    -.0182618   -.0094434
invt_network_size |   .0094635   .0001072    88.32   0.000     .0092535    .0096735
                  |

2) with clustering
Code:
 
nbreg teamsize internetdummy invt_network_size i.cbsacode i.appyear, vce(cluster cbsacode)

Negative binomial regression                    Number of obs     =    462,187
                                                Wald chi2(6)      =          .
Dispersion           = mean                     Prob > chi2       =          .
Log pseudolikelihood = -851639.86               Pseudo R2         =     0.0225

                                  (Std. Err. adjusted for 495 clusters in cbsacode)
-----------------------------------------------------------------------------------
                  |               Robust
         teamsize |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
    internetdummy |  -.0138526   .0092945    -1.49   0.136    -.0320695    .0043643
invt_network_size |   .0094635   .0006227    15.20   0.000      .008243     .010684

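For anyone who wants to see the two sets of standard errors side by side, the two runs above could be stored and tabulated along these lines. This is only a sketch: the names m_robust and m_cluster are arbitrary labels chosen for illustration, and the display formats are a matter of taste.

```stata
* Sketch: fit the same model with both variance estimators and compare SEs.
* m_robust and m_cluster are arbitrary (hypothetical) names for the stored results.
nbreg teamsize internetdummy invt_network_size i.cbsacode i.appyear, vce(robust)
estimates store m_robust

nbreg teamsize internetdummy invt_network_size i.cbsacode i.appyear, vce(cluster cbsacode)
estimates store m_cluster

* Point estimates are identical; only the standard errors differ.
estimates table m_robust m_cluster, keep(internetdummy invt_network_size) b(%9.7f) se(%9.7f)
```
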
From my research on the internet and on this forum, I've come to the following conclusions:
- The traditional view of clustering SEs is that if you expect observations to be correlated within clusters, so that they are not iid (which I do expect, because several observations within a cluster come from the same inventor), you should cluster anyway.

- The paper 'When Should You Adjust Standard Errors for Clustering?' by Abadie, Athey, Imbens and Wooldridge contradicts this view and treats clustering as a design problem. I find it difficult to judge whether either the sampling or the assignment to treatment was clustered in my case.
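
One possible way to look at the assignment question is to check whether internetdummy varies within a CBSA-year cell or is constant within it, which would suggest assignment at that level. The following is a minimal sketch, assuming the variable names from the snapshot above; varies and cell_tag are hypothetical helper variables, not part of the data set.

```stata
* Sketch: does internetdummy vary within a cbsacode-appyear cell?
* Sorting on internetdummy within each cell puts the minimum first and the
* maximum last (missing values, if any, sort last and would flag the cell).
bysort cbsacode appyear (internetdummy): generate byte varies = internetdummy[1] != internetdummy[_N]

* Tag one observation per cell so each cell is counted once.
egen byte cell_tag = tag(cbsacode appyear)

* varies == 0 in every cell would mean internetdummy is constant within CBSA-year.
tabulate varies if cell_tag
```

This does not settle whether to cluster; it only describes at what level internetdummy changes in the data.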

I know there is still a lot of discussion about combining fixed effects with clustered standard errors.

Could anyone perhaps clarify for my specific situation?

Thanks.
Ludo