This question arose in my class and I was not confident enough to provide a definitive answer:

Keeping it simple:
Imagine a large government survey data set based on a two-stage sample with four regional strata.
Within each stratum, schools are selected using simple random sampling (for a total of, say, 200 schools).
Then within each school, three classrooms are selected at random.
But neither the data set nor documentation identifies the strata precisely, so it is impossible to know the exact number of schools (N_h) in each stratum's population.

If I set the data as follows:
svyset school [pw=weight], strata(region) || classroom

...the secondary sampling units are ignored for purposes of variance estimation because the FPC option is omitted for the first stage. If the classrooms are internally homogeneous (e.g., students tracked by test scores), then the variance estimates will be too small for many analyses.

Assuming that the total population of schools is very large, what are the implications of setting the first-stage FPC (read as a sampling rate) to a very small number, such as 0.0001?
generate fpc1 = .0001  // fpc() takes a variable name; fpc1 is just a placeholder
svyset school [pw=weight], strata(region) fpc(fpc1) || classroom

Is there any harm in forcing consideration of second-stage clustering by doing so? It seems that the sampling variance will reflect an assumed first-stage sampling rate close to zero, which will imperceptibly inflate the variance estimates, and that seems preferable to ignoring second-stage clustering. Am I missing something that would cause more mischief? Are there specific kinds of designs (e.g., with PPS first-stage selection) where this could have unexpected effects?
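
For reference, here is my understanding of the stratified two-stage variance estimator for a total. This is a sketch in my own notation, assuming simple random sampling without replacement at both stages, and it should be checked against the [SVY] variance estimation entry. Here f_h is the first-stage sampling rate in stratum h, f_hi is the second-stage rate within school i, n_h is the number of sampled schools, m_hi is the number of sampled classrooms, and y_hi and y_hij are the weighted totals for a school and a classroom:

```latex
\widehat{V}(\widehat{Y}) =
  \sum_{h} (1 - f_h)\,\frac{n_h}{n_h - 1} \sum_{i=1}^{n_h} \left( y_{hi} - \bar{y}_{h} \right)^2
  \;+\;
  \sum_{h} f_h \sum_{i=1}^{n_h} (1 - f_{hi})\,\frac{m_{hi}}{m_{hi} - 1} \sum_{j=1}^{m_{hi}} \left( y_{hij} - \bar{y}_{hi} \right)^2
```

If that form is right, setting f_h = 0.0001 multiplies the between-school term by 0.9999 and the within-school (classroom) term by 0.0001, so I would like to confirm how much the second stage can actually contribute under such a setting.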

Any insights welcome!