I'm performing some analyses were I need to draw a stratified two-stage sample from some population data. The data (apipop from UCLA's stats site) are at the school-level and schools are nested within districts. I first divide the data into two strata. In the first stratum, I sample ~50% of districts (189/377). In the second stratum, I sample 25% (95/380). Then, I randomly sample 3 schools per sampled district. If a district has less then 3 schools, schools are sampled with certainty. I then calculate sampling weights by taking the product of the inverse of each stage's sampling fraction. When I apply the weight (using svyset) and estimate the mean of a variable, the population size (as listed in the output) should be equal the total number of schools that I started with (6194), but it isn't (it equals 6565). Any idea what I've done wrong? I've been staring at this for far too long!

Here's the code:

Code:
use https://stats.idre.ucla.edu/stat/stata/seminars/svy_stata_intro/apipop, clear

set seed 12345678

// Create strata based on API99. 

egen mean = mean(api99), by(dnum)
g strata = 1
replace strata = 2 if mean > 650

ta strata
codebook dnum

// Sample 50% of districts in stratum 1 and 25% of districts in stratum 2.

frame put dnum if strata == 1, into(s1)
frame put dnum if strata == 2, into(s2)
frame s1: contract dnum
frame s1: count                             
frame s1: sample 50                    
frame s1: count                            
frame s1: g first_stage = 1            
frame s2: contract dnum           
frame s2: count                            
frame s2: sample 25                    
frame s2: count                             
frame s2: g first_stage = 1            
frlink m:1 dnum, frame(s1)
frlink m:1 dnum, frame(s2)
frget first_stage1=first_stage, from(s1)    // Link original data to frame s1
frget first_stage2=first_stage, from(s2)    // Link original data to frame s2

// Keep only the districts selected in the first stage; drop the rest

keep if first_stage1 == 1 | first_stage2 == 1  

// Randomly sample 3 schools from each of the selected districts. 

bysort dnum: g random = runiform()      
sort dnum random                       
bysort dnum: g number = _n  
bysort dnum: g NN = _N                 
drop if number > 3                   
bysort dnum: g nn = _N                 

// The sampling fraction for the first stage was 189/377 in stratum 1 and 
// 95/380 in stratum 2. In the second stage we sampled nn (usually 3) out of
// NN in a district. Take the inverse of these sampling fractions and then multiply 
// to create the sampling weight, pw.

g double p1 = 377/189 if strata == 1      
replace p1 = 380/95 if strata == 2
g double p2 = NN/nn
g double pw = p1*p2

// Pass design information to Stata 

svyset dnum [pweight = pw], strata(strata) fpc(fpc) 

svy: mean api00