Dear all,

I have a question that I have been struggling with for some time now even after reading many threads and am finally asking for the help of Statalisters.

I have an unbalanced panel data set (18 years, 36 countries) for my thesis. The data for my control variables are also an unbalanced panel.

To give some background: I have scored national policy documents on how well they adopt a particular framework. I did so by dividing each policy document into the 20 common or standard sections (the sections are my independent variables) and scoring each section on a scale from 0 to 3 (0 = no such section exists for that country, 1 = very low score, 2 = medium score, 3 = high score). The scores enter the model as categorical variables with 0 as the base level, and I am trying to estimate the effect of each section's score on my dependent variable, a development index. My data cover 1997-2015. The documents were published in different years for different countries (the earliest in 2000 and the latest in 2009), so I assigned a 0 for the years before a country's document was published and its score (0, 1, 2, or 3) for every year from publication until 2015. This is similar to what Elkins, Feeny, and Prentice did (source: Meg Elkins, Simon Feeny & David Prentice, 2018. Are Poverty Reduction Strategy Papers Associated with Reductions in Poverty and Improvements in Well-being? Journal of Development Studies, 54(2), pages 377-393).
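The coding rule described above can be sketched for one section as follows. This is only a minimal illustration: Year, PubYear (the year the country's document was published), and RawScore1 (the 0-3 score given to section 1 of that document) are hypothetical variable names, not necessarily the ones in my data.

```stata
* Hypothetical names: Year, PubYear, RawScore1 (see the note above).
* The score is 0 before publication and the document's score from the
* publication year onward. A missing PubYear also yields 0, because
* Year >= . evaluates to false in Stata.
generate byte Section1 = cond(Year >= PubYear, RawScore1, 0)

* Label the levels so that i.Section1 output is easier to read.
label define score 0 "none" 1 "very low" 2 "medium" 3 "high"
label values Section1 score
```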

Some of my independent variables are time invariant, because some countries' policy documents did not include one or more of the 20 standard sections and so have a score of 0 in all years. I also think that both the within- and between-country effects are valuable. For these reasons I thought to use -xtreg, re- with vce(cluster Country) to account for potential correlation of the errors within countries.

I tried to justify the random effects model with a Hausman test, but I am running 20 separate regressions, one for each independent variable, with Bonferroni corrections. (When I tried to include all 20 together, about half of my independent variables of interest were omitted from the results, and I did not see a logical way to group them.) So I ran a Hausman test for each of the 20 regressions and got varying results: 7 pointed to RE, 12 pointed to FE, and 1 returned an error (-9.39 chi2<0 ==> model fitted on these data fails to meet the asymptotic assumptions of the Hausman test; see suest for a generalized test) that I could not resolve with sigmamore or xtoverid. I was hoping to use the same model for every regression so that I could report my results consistently.
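For concreteness, the Hausman comparison described above looks roughly like this for one section. It is a minimal sketch that reuses the variable names from the sample code at the end of this post and assumes the time variable is called Year (a hypothetical name).

```stata
* Declare the panel structure; Year is an assumed name for the time variable.
xtset Country_n Year

* hausman requires the conventional (non-robust) variance estimates,
* so vce(cluster) is left off for these two fits.
xtreg DepVar i.Section1 Control1 Control2 Control3 Control4 Control5 Control6 Control7, fe
estimates store fe1

xtreg DepVar i.Section1 Control1 Control2 Control3 Control4 Control5 Control6 Control7, re
estimates store re1

* sigmamore bases both covariance matrices on the disturbance variance
* estimate from the efficient (RE) model.
hausman fe1 re1, sigmamore
```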

I would appreciate some clarification as to which method is recommended here; any insight would help. I have seen RE and FE used while controlling for year effects (I have upward linear time trends for each country), and I have also seen OLS combined with clustering of observations by country (source: Bradley, David, Huber, Evelyne, Moller, Stephanie, Nielsen, François & Stephens, John. (2003). Distribution and Redistribution in Post-Industrial Democracies. World Politics, 55, pages 193-228).

Apologies for the length and for any unnecessary complexity. Thank you in advance.

Sample of the code I used:
Code:
xtreg DepVar i.Section1 Control1 Control2 Control3 Control4 Control5 Control6 Control7, re vce(cluster Country_n)

xtreg DepVar i.Section2 Control1 Control2 Control3 Control4 Control5 Control6 Control7, re vce(cluster Country_n)

xtreg DepVar i.Section3 Control1 Control2 Control3 Control4 Control5 Control6 Control7, re vce(cluster Country_n)