This question arose in my class, and I was not confident enough to give a definitive answer.
Keeping it simple:
Imagine a large government survey data set based on a two-stage sample with four regional strata.
Within each stratum, schools are selected using simple random sampling (for a total of, say, 200 schools).
Then within each school, three classrooms are selected at random.
But neither the data set nor documentation identifies the strata precisely, so it is impossible to know the exact number of schools (N_h) in each stratum's population.
If I set the data as follows:
svyset school [pw=weight], strata(region) || classroom
...the secondary sampling units are ignored for purposes of variance estimation because the FPC option is omitted for the first stage. If the classrooms are homogeneous (e.g., students tracked by test scores) then the variance estimates will be too small for many analyses.
Assuming that the total population of schools is very large, what are the implications of setting the FPC to a very small number, such as 0.0001?
generate fpc1 = 0.0001
svyset school [pw=weight], strata(region) fpc(fpc1) || classroom
Is there any harm in forcing consideration of second-stage clustering by doing so? It seems that the sampling variance will reflect an assumption of an FPC close to zero, which will imperceptibly inflate the variance estimates, and that seems preferable to ignoring second-stage clustering. Am I missing something that would cause more mischief? Are there specific kinds of designs (e.g., with PPS first-stage selection) where this could have unexpected effects?
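For reference, it may help to write out the stratified two-stage variance estimator for a total, in the form I recall from the Stata [SVY] variance estimation documentation (worth checking against the manual). The notation is my own assumption: $f_h$ is the first-stage sampling rate in stratum $h$, $f_{hi}$ the second-stage rate in school $i$, $n_h$ the number of sampled schools, $m_{hi}$ the number of sampled classrooms, and $y_{hi}$, $y_{hij}$ the weighted school and classroom totals.

```latex
\widehat{V}(\widehat{Y}) =
  \sum_{h} (1 - f_h)\,\frac{n_h}{n_h - 1}
    \sum_{i=1}^{n_h} \left( y_{hi} - \bar{y}_{h} \right)^2
  \;+\;
  \sum_{h} f_h \sum_{i=1}^{n_h} (1 - f_{hi})\,\frac{m_{hi}}{m_{hi} - 1}
    \sum_{j=1}^{m_{hi}} \left( y_{hij} - \bar{y}_{hi} \right)^2
```

If this form is right, omitting the first-stage FPC amounts to $f_h = 0$, so the second term drops out entirely. With a first-stage rate of 0.0001, the first term is multiplied by 0.9999 and the second-stage term enters with a factor of 0.0001, so the classroom stage is formally included but should change the estimate very little in either direction.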
Any insights welcome!