Hello all,

I would like to preface this post by saying I am new to Stata. I have read other threads on similar issues but am still at a loss, so I apologize in advance for any obvious or poorly phrased questions.

I am working with a sample of just over 1,900 observations on 183 firms, covering fiscal years 1996-2018. The panel is unbalanced: some firms have data for every fiscal year from 1996 through 2018, while others have only one or a few fiscal years (this is the sample I was provided). The goal is to compare the performance of different types of firms (indicated by a dummy variable).

I took the first step with the following commands:
. xtset gvkey fiscalyear, yearly
. xtdescribe
. tsfill

The gvkey is the code used to identify each firm.

Then I performed the Breusch-Pagan LM test for random effects versus pooled OLS; rejecting the null pointed to RE rather than pooled OLS.
Following this, I ran the Hausman test for fixed versus random effects; rejecting the null pointed to FE rather than RE.
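
For reference, the test sequence described above can be sketched roughly as follows. This is only an illustration under assumptions: it uses Stata's shipped nlswork example panel (idcode, year, ln_wage, age, tenure) as a stand-in for my own data, and the stored-estimate names fe and re are arbitrary labels.

```stata
* Stand-in data (assumption): Stata's nlswork example panel, not my sample
webuse nlswork, clear
xtset idcode year

* Breusch-Pagan LM test for random effects: run right after an RE fit
xtreg ln_wage age tenure, re
xttest0

* Hausman test: store the FE and RE fits, then compare them
xtreg ln_wage age tenure, fe
estimates store fe
xtreg ln_wage age tenure, re
estimates store re
hausman fe re
```

With my data, ln_wage, age and tenure would be replaced by the performance measure, the variable of interest and the controls, and idcode/year by gvkey/fiscalyear.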

Great, so now I have chosen the following regression (for simplicity, I have abbreviated the performance measure, the variable of interest, and all the controls to y and x):
. xtreg y x, fe

Naturally, I want to check the regression for autocorrelation and heteroskedasticity. An earlier test that I ran after "reg y x", namely "estat imtest, white", indicated heteroskedasticity, and "xtserial y x" indicated serial correlation.

I used "xttest3" after the regression and found that there is heteroskedasticity. BUT, when I tried to use "xttest2", I got the following response: "Error: too few common observations across panel. no observations". I then tried "xtcsd, pesaran" instead, because I thought it would work with an unbalanced panel, but got the response "Error: The panel is highly unbalanced. Not enough common observations across panel to perform Pesaran's test. insufficient observations". So I am not sure how to deal with the unbalanced panel in order to test for autocorrelation after fitting a fixed effects model.

In any case (assuming there is autocorrelation), I proceeded to use "xtreg y x, fe vce(robust)" and "xtreg y x, fe vce(cluster gvkey)" as a remedy. How can I interpret the results of these regressions to know which option is better? As far as I can tell, the results are the same, and both point to an FE model.
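
A minimal sketch of how the two sets of standard errors can be compared directly, again using the nlswork example panel as a stand-in (an assumption, not my data); the scalar names se_robust and se_cluster are made up for illustration. As far as I understand the xtreg documentation, vce(robust) with the fe option is treated as clustering on the panel variable, which would explain identical output.

```stata
* Stand-in data (assumption): Stata's nlswork example panel, not my sample
webuse nlswork, clear
xtset idcode year

* FE with "robust" standard errors
xtreg ln_wage age tenure, fe vce(robust)
scalar se_robust = _se[age]

* FE with standard errors clustered on the panel identifier
xtreg ln_wage age tenure, fe vce(cluster idcode)
scalar se_cluster = _se[age]

* Show the two standard errors for age side by side
display "robust: " se_robust "   cluster: " se_cluster
```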

So in sum:
1) How can I test for autocorrelation in an FE model when I am running into these issues with an unbalanced panel? I want to be able to justify using the models above.
2) Which of the two options, vce(robust) or vce(cluster gvkey), yields more robust results (if either)? It seems to depend on the case, so I wanted to ask whether you have any recommendations.

If there's any other information I can provide, please let me know! A big thank you in advance to anyone who can help!

Best,

Evi