This question arose in my class and I was not confident enough to provide a definitive answer:
Keeping it simple:
Imagine a large government survey data set based on a two-stage sample with four regional strata.
Within each stratum, schools are selected using simple random sampling (for a total of, say, 200 schools).
Then within each school, three classrooms are selected at random.
But neither the data set nor documentation identifies the strata precisely, so it is impossible to know the exact number of schools (N_h) in each stratum's population.
If I set the data as follows:
svyset school [pw=weight], strata(region) || classroom
...the secondary sampling units are ignored for purposes of variance estimation because the FPC option is omitted for the first stage. If the classrooms are homogeneous (e.g., students tracked by test scores), then the variance estimates will be too small for many analyses.
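For reference, here is the two-stage linearized variance estimator for a total, as I recall it from the Stata [SVY] variance estimation entry. Treat it as a sketch to check against the manual. The notation is my own shorthand: y_{hi} is the weighted total for school i in stratum h, y_{hij} is the weighted total for classroom j within that school, f_h = n_h/N_h is the first-stage sampling rate, and f_{hi} is the second-stage sampling rate (taken as 0 when no second-stage FPC is given).

```latex
\widehat{V}(\widehat{Y})
  = \sum_{h} (1 - f_h)\,\frac{n_h}{n_h - 1}
      \sum_{i=1}^{n_h} \left( y_{hi} - \bar{y}_{h} \right)^2
  + \sum_{h} f_h \sum_{i=1}^{n_h} (1 - f_{hi})\,\frac{m_{hi}}{m_{hi} - 1}
      \sum_{j=1}^{m_{hi}} \left( y_{hij} - \bar{y}_{hi} \right)^2
```

When the first-stage FPC is omitted, f_h is treated as 0, so the second sum drops out and only the between-school term remains. For a tiny positive f_h, the first term is multiplied by a factor just under 1 and the second term by a factor just over 0.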
Assuming that the total population of schools is very large, what are the implications of setting the FPC to a very small number, such as 0.0001?
generate double fpc1 = 0.0001  // fpc() expects a variable name; fpc1 is an illustrative name
svyset school [pw=weight], strata(region) fpc(fpc1) || classroom
Is there any harm in forcing consideration of second-stage clustering by doing so? It seems that the sampling variance will reflect an assumption of an FPC close to zero, which will imperceptibly inflate the variance estimates, which is preferable to ignoring second-stage clustering. Am I missing something that would cause more mischief? Are there specific kinds of designs (e.g., with PPS first-stage selection) where this could have unexpected effects?
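One way to see the practical effect is to run the same estimate under both declarations and compare the standard errors. This is only a sketch: `score` is a hypothetical outcome variable and `fpc1` is an illustrative name for the constant sampling-rate variable. Neither comes from the actual data set.

```stata
* Sketch only: score and fpc1 are hypothetical names, not from the real data
capture drop fpc1
generate double fpc1 = 0.0001      // values <= 1 are read as a sampling rate

* Design 1: no first-stage FPC, so the classroom stage is ignored
svyset school [pw=weight], strata(region) || classroom
svy: mean score

* Design 2: tiny first-stage FPC, so the classroom stage enters the variance
svyset school [pw=weight], strata(region) fpc(fpc1) || classroom
svy: mean score
```

If the two sets of standard errors agree to several decimal places, the small FPC is changing the variance estimate very little in this design.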
Any insights welcome!