Hi,

I am working with a nationally representative annual cross-sectional household survey that uses a two-stage stratified cluster sampling design. I have pooled the years 2008-2011 (four cross-sectional datasets appended together), and I specified the svyset command for the pooled dataset based on the sampling documentation, which describes the following process:

Sampling frame: Population census. The selection of households is based on the 2004/2005 PSU ("primary sample units") as an update of the 1996 population census
Procedure used to update the sampling frame: Quick Account
Lowest level of geographic disaggregation for which reliable estimates of the unemployment rate can be produced and their frequency: Shiaka/Village (quarterly)
The sample is stratified: Yes
Variables used for stratification: geographic region (gov & urban/ rural)
Primary Sampling Units(PSU): Sample Area based on Skhiakha or village
Number of sampling stages: 2
Ultimate sampling units: households
Number of ultimate sampling units per sample area: 70 households are selected in each PSU selected in stage 1 (size of cluster)
Sample size: 21000 ultimate sampling units per quarter
Sample rotation takes place: at the ultimate sampling unit level only
The rotation system results in: overlap between consecutive survey periods and overlap between the same periods one year apart
Percentage of ultimate sampling units remaining in the sample for two consecutive survey rounds: 33.3%
Maximum number of times an ultimate sampling unit is interviewed: 3
Months needed to renew the sample completely: 9S



Based on the above, I used the following code:


egen strata = concat(gov urban) // creates a single stratum identifier from the stratum (gov) and substratum (urban/rural), as per the stratification of the sample
egen super_strata = group(yr strata) // makes the strata year-specific: the years are appended, the PSUs are independent across years, and the psu_id is created separately for each year


followed by this svyset command:

svyset psu [pweight=pw], singleunit(centered) strata(super_strata) || hhid
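
One way to compare the declared design with the documentation (for example, roughly 70 households per PSU and no strata left with a single PSU) is to inspect it after svyset. This is only a minimal sketch; it assumes the svyset command above has already been run on the pooled data and uses no variables beyond those already named.

```stata
* Assumes the svyset command above has already been run on the pooled data.
svyset                  // with no arguments, redisplays the current survey settings
svydescribe             // tabulates strata and sampling units at stage 1
svydescribe, single     // lists only the strata that contain a single sampling unit
```

Any strata listed by the last command are the ones that singleunit(centered) would be handling.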

I have the following questions:

1. The dataset provides only one weight (pweight). Within each governorate (locality) I noticed four distinct weight values (one each for urban males, rural males, urban females, and rural females), so it seems that poststratification by sex has already been built into the pweight. How can I include that poststratification in the design when I have only one weight?

2. Does the svyset command above correctly reflect the sampling design described in the documentation?

3. Do I need to add an FPC in this case? I don't have a variable that is clearly an FPC, so are there any suggestions for how to calculate one?
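
To make questions 1 and 3 concrete: svyset accepts fpc(), poststrata(), and postweight() options. The sketch below only shows where they would go. The variables n_psu_pop, poststr, and pop_total are hypothetical and are not in the dataset as provided, and whether any of them is appropriate for this survey is exactly what is being asked.

```stata
* Hypothetical variables, none of which exist in the dataset as provided:
*   n_psu_pop = number of PSUs in the sampling frame for each stratum
*               (fpc() also accepts a sampling rate between 0 and 1)
*   poststr   = poststratum identifier (e.g., gov x urban/rural x sex)
*   pop_total = known population total for each poststratum
svyset psu [pweight=pw], strata(super_strata) fpc(n_psu_pop) ///
    singleunit(centered) poststrata(poststr) postweight(pop_total) ///
    || hhid
```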

Many thanks in advance,

Abeya Mokhtar