Pooled OLS versus random effects in an extremely unbalanced panel

Hello. I am working with a dataset of 2,949 observations on individuals surveyed in 96 communities over 5 years. The same 96 communities were followed throughout, but the same individuals were not necessarily interviewed each year: 60% of individuals are observed only once, 30% twice, and none in all 5 years.

I could not find a specific model or recommendation for this kind of “hybrid” situation (i.e. a repeated cross-section of communities with very few individuals repeated over time, or an extremely unbalanced panel of individuals). I see two options:

1. Proceed as a repeated cross-section: pooled OLS with robust standard errors (clustered by individual or by community-year).
2. Proceed as a panel data model: random effects model with robust standard errors (clustered by individual or by community).
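
For concreteness, here is a minimal sketch of the first option in Python with NumPy, run on simulated data rather than the actual survey. The function name `pooled_ols_cluster`, the variable names, the number of people, and the split of the remaining 10% of individuals between 3 and 4 waves are all assumptions made for illustration. It fits pooled OLS once and then computes cluster-robust (sandwich) standard errors under three cluster definitions, which shows that the choice of cluster affects only the standard errors, not the coefficients.

```python
import numpy as np

def pooled_ols_cluster(X, y, clusters):
    """Pooled OLS with cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # "Meat" of the sandwich: sum over clusters g of (X_g' u_g)(X_g' u_g)'
    labels, idx = np.unique(clusters, return_inverse=True)
    n_clusters = labels.size
    scores = np.zeros((n_clusters, k))
    np.add.at(scores, idx, X * resid[:, None])
    meat = scores.T @ scores

    # A common finite-sample correction: G/(G-1) * (n-1)/(n-k)
    correction = n_clusters / (n_clusters - 1) * (n - 1) / (n - k)
    cov = correction * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated stand-in for the survey (all sizes and shares are assumptions):
# 60% of people seen once, 30% twice, the rest 3 or 4 times, nobody 5 times.
rng = np.random.default_rng(0)
n_people, n_comm, n_years = 1500, 96, 5
waves = rng.choice([1, 2, 3, 4], size=n_people, p=[0.60, 0.30, 0.07, 0.03])
person = np.repeat(np.arange(n_people), waves)
year = np.concatenate([rng.choice(n_years, size=k, replace=False) for k in waves])
comm = rng.integers(0, n_comm, size=n_people)[person]
comm_year = comm * n_years + year  # one id per community-year cell

x = rng.normal(size=person.size)
y = (1.0 + 2.0 * x
     + rng.normal(size=n_people)[person]   # individual effect
     + rng.normal(size=n_comm)[comm]       # community effect
     + rng.normal(size=person.size))       # idiosyncratic noise
X = np.column_stack([np.ones_like(x), x])

beta, se_person = pooled_ols_cluster(X, y, person)
_, se_comm_year = pooled_ols_cluster(X, y, comm_year)
_, se_comm = pooled_ols_cluster(X, y, comm)

print("coefficients:", beta)
print("SE clustered by individual:    ", se_person)
print("SE clustered by community-year:", se_comm_year)
print("SE clustered by community:     ", se_comm)
```

Because the simulated outcome includes a community-level effect, errors are correlated within communities, and that is the kind of dependence that clustering only by individual would not account for. Whether the same holds in the real data is an empirical question.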

I would appreciate any advice on how to proceed in this situation. I am also considering constructing a pseudo panel (by age cohort and gender) as a robustness check, although I would end up with only about 100 observations.
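
The pseudo panel idea can be sketched as follows, again in Python with NumPy and on simulated data. The helper `pseudo_panel`, the ten five-year birth cohorts, and the variable names are hypothetical choices. Ten cohorts by two genders by five years gives at most 100 cells, which is one way to arrive at a figure of that size, not necessarily the grouping intended here. Each cell keeps its count so that unequal cell sizes can be handled later, for example by weighting.

```python
import numpy as np

def pseudo_panel(cohort, gender, year, **variables):
    """Collapse individual records to cohort-gender-year cell means."""
    keys = np.column_stack([cohort, gender, year])
    cells, idx = np.unique(keys, axis=0, return_inverse=True)
    idx = idx.ravel()  # guards against shape differences across NumPy versions
    counts = np.bincount(idx, minlength=len(cells))
    means = {
        name: np.bincount(idx, weights=np.asarray(v, dtype=float),
                          minlength=len(cells)) / counts
        for name, v in variables.items()
    }
    return cells, counts, means

# Simulated individual-level records (values are illustrative assumptions)
rng = np.random.default_rng(1)
n = 2949
birth_year = rng.integers(1950, 2000, size=n)
cohort = (birth_year - 1950) // 5   # 10 five-year birth cohorts (assumed)
gender = rng.integers(0, 2, size=n)
year = rng.integers(0, 5, size=n)
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

cells, counts, means = pseudo_panel(cohort, gender, year, x=x, y=y)
print("number of cells:", len(cells))
print("smallest and largest cell:", counts.min(), counts.max())
```

Each row of `cells` is one (cohort, gender, year) combination, and `means["x"]` and `means["y"]` hold the matching cell averages that would enter the pseudo panel regression.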

Thank you.

BJ Data Tech Solution
