I am working with nationally representative data for Egypt and want to specify the correct svyset command, but I am not sure how to set it up given the following two-stage sampling procedure:
"The master sample was extracted through a two-stage process. The country is first divided into two strata: urban and rural. Each stratum is in turn divided into substrata representing
each governorate. All the villages (in the case of rural strata) or shiyakhas (urban quarter in the case of urban strata) in each substratum were listed and assigned a weight based on
their population. The first stage consisted of choosing the villages and shiyakhas that would be represented in the sample based on the principle of probability proportional to
size. This meant that a shiyakha or a village is possibly selected more than once if its size warrants that. The selected shiyakhas and villages are then divided into PSUs of
approximately 1500 housing units each; then one or more PSUs are selected from each shiyakha or village. The selected PSUs were then re-listed in 1995 to enumerate all the
households selected. The master sample contains 306 urban PSUs and 194 rural PSUs. For the survey sample I am working with, 200 PSUs were selected from the master sample, on the basis of the number shown in Table 1. The desired number of PSUs in each substratum was selected from the number available in the master sample using a systematic interval. Cairo
and Alexandria were deliberately over-sampled and rural areas under-sampled to increase the probability of obtaining women wage-workers in the private sector, which tend to be
concentrated in Metropolitan areas. A self-weighted sample would have yielded too few observations for these important sects of the population."
The variables I have are the following:
1. PSU, which has 200 unique values: psu
2. Probability weight: expan
3. Governorate (which is like a state), with 22 unique values, one per governorate: gov
4. Urban/rural dummy variable: urban/rural
5. Shiyakha/village variable, which has 26 different values: Shiyakha
Some of the villages or shiyakhas are urban and some are rural.
Sorry for the long description, but I wanted to give enough information. I am lost on how to reflect these stages, and the correct interpretation of the process, in the svyset command, and I am not sure whether I should use the urban/rural, governorate, or shiyakha variable, or all of them.
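One possible starting point is sketched below. It is only an untested guess, not a confirmed specification. It assumes that the strata are the urban/rural-by-governorate cells described in the documentation, that psu identifies the sampled clusters, and that the urban/rural dummy is stored under the hypothetical name `urban` (the real variable name may differ). The variable `stratum` is created here purely for illustration.

```stata
* Assumption: each stratum is one urban/rural-by-governorate cell
egen stratum = group(urban gov)

* Declare the design: psu as the cluster identifier, expan as the probability weight
svyset psu [pweight = expan], strata(stratum)

* List the number of PSUs per stratum to spot strata with a single sampled PSU
svydes

* If some strata contain only one PSU, one common workaround is to center
* those units at the grand mean so that standard errors can still be computed
svyset psu [pweight = expan], strata(stratum) singleunit(centered)
```

Whether the shiyakha/village variable should also enter the design (for example as an earlier sampling stage) depends on how it relates to psu in the data, which this sketch does not resolve.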
Thanks in advance for your help. I have already looked through past posts but did not find anything similar to these stages.