Dear Statalisters,

I am working with adminstrative data to compare hospital admission rates across 177 hospitals using hierarchical logistic regression. I have calculated predicted and expected probabilities of hospital admission which gives me the risk-adjusted admission ratio for each of the hospitals. I am now trying to bootstrap 95% confidence intervals for each of the PE ratios, but am encountering some problems.

In the code below, the program I have created for the -bootstrap- comamnd does create 95%CI for each hospital; however it is using information from the entire dataset instead of only information from each hospital individually. I would like that each loop only used the data available from each hospital, instead of the entire dataset. I believe this is the right way to generate the confidence intervals for this type of analysis that I'm running (if anybody has a different view on this, please feel free to comment here as well)

Any help/advice is greatly appreciated.

Code:
    quietly {
        * Synthetic dataset that is similar to the actual data
        webuse bangladesh, clear
        rename (c_use age urban child3 district) (sex age_c language_english ATS_urgent hospital_encoded)
        drop child1 child2 children
        generate type_LBP = round(runiform() * 2)
        generate arrival_ambulance = round(runiform() * 2)
        generate afterhours = round(runiform() * 2)
        generate admitted = round(runiform() * 2)
        summarize
    }    

    quietly {
        * Make a program that runs the analysis for a single hospital.
        cls
        capture program drop bsprog
        program define bsprog, rclass
        syntax [, hospNum(integer 1)]

            *Hierarchical regression
            melogit admitted i.sex age_c i.language_english i.ATS_urgent ib(last).type_LBP i.arrival_ambulance ib(last).afterhours || hospital_encoded:, or
            capture drop predicted p_admission expected e_admission ratioPtoE
            *Predicted
            predict predicted, mu
            bysort hospital_encoded: egen p_admission = total(predicted)                        
            *Expected
            predict expected, mu conditional(fixedonly)
            bysort hospital_encoded: egen e_admission = total(expected)
            *Ratio
            generate ratioPtoE = p_admission / e_admission if hospital_encoded == `hospNum'
            sum ratioPtoE
            return scalar PEratio = r(mean)
        end
    }
    
    quietly {
        ** Set up variables to obtain bootstrap CIs for the estimate from each hospital.
        generate hospNum = .
        generate ratio = .
        generate paramLCL = .
        generate paramUCL = .
        generate percentLCL = .
        generate percentUCL = .
        local row = 1
    }
    
    ** Obtain bootstrap CIs for the estimate from each hospital.
    foreach hospNum of numlist 1/5 {
        replace hospNum = `hospNum' in `row'
        display as text _n "For hospital number " `hospNum' " ..."
        bootstrap r(PEratio), reps(10) seed(448864) cluster(hospital_encoded) idcluster(hospitals) nodots: bsprog, hospNum(`hospNum')     
        replace ratio      = r(table)[1,1] in `row'
        replace paramLCL   = r(table)[5,1] in `row'
        replace paramUCL   = r(table)[6,1] in `row'        
        matrix list e(ci_percentile)                                                                        
        replace percentLCL = e(ci_percentile)[1,1] in `row'
        replace percentUCL = e(ci_percentile)[2,1] in `row'        
        local row = `row' + 1
    }
    local row = `row' - 1
    order hospNum ratio param* percent*, last
    list hospNum ratio param* percent* in 1/`row'