Dear Statalisters,

I am working on a household survey with pooled cross-sectional data with the following characteristics:

In the first stage, enumeration areas (PSUs) were identified; in the second stage, households were randomly selected within each PSU. For each household and survey year, data were collected on all household members.
  • I suspect that the PSUs are mostly the same over time, with slight modifications/additions, but I have no way to confirm this.
  • Each wave has its own stratification but all the stratifications are mainly regional.
  • Each wave has its own sampling weight.
  • Year 1 has 368 PSUs and 10 strata.
  • Year 2 has 404 PSUs and 10 strata.
  • Year 3 has 697 PSUs and 34 strata.
I think declaring the data as survey data is not really important for the main regressions if I include time dummies.

However, I would like to svyset the data for descriptive analysis (both graphical and tabular), but I am not sure how to do this correctly. My aim is to present a mix of individual and household characteristics and to conduct subgroup analyses.

I have searched previous posts but did not find a well-explained solution to my predicament. Based on those posts, I have tried something along these lines:

    svyset psuXyear [pw=weight], strata[strataXyear]

but I keep getting the error "too many weights". I have also tried dividing the weights by 3, since I have 3 periods, but nothing changed.
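
For concreteness, here is a minimal sketch of the setup I am aiming for. It assumes the raw variables are named psu, stratum, year and weight, and that age and female exist for the descriptive step; all of these names are hypothetical placeholders. As far as I understand the documented svyset syntax, options such as strata() take parentheses, while square brackets are reserved for the weight expression, so the bracketed strata in my attempt may be what Stata is reading as a second weight.

```stata
* Hypothetical variable names: psu, stratum, year, weight, age, female

* Make PSU and stratum identifiers unique within each survey year
egen psuXyear    = group(year psu)
egen strataXyear = group(year stratum)

* Options go in parentheses after the comma; only the weight goes in [ ]
svyset psuXyear [pw=weight], strata(strataXyear)

* Example descriptive estimates: overall, then for a subgroup
svy: mean age
svy, subpop(female): mean age
```

Whether pooling the three waves this way (year-specific strata and PSUs, with the original wave weights) is appropriate for my design is exactly what I would like advice on.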

Thank you for your anticipated response.