Using frequency weights on graph bar to produce weighted averages.

Hello. I am working with a dataset that records the number of births and the number of preterm births at facilities in different districts. I want to show each district's preterm birth rate on an hbar graph. The first approach I tried (and the more intuitive one, to me) was to create the variables pretermtotal and totaldels, the district totals of preterm births and of all births respectively, and then divide one by the other to get the district preterm birth rate.

I then thought it might be more efficient to create a single preterm birth variable, pretermrate2 in my code, holding the preterm birth rate for each individual observation, and then graph the mean of pretermrate2 using each observation's total deliveries as the frequency weight, which would in effect give me a weighted average. If that works, I could cut out two lines of code and create fewer new variables.
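To spell out the expectation behind this second approach (the symbols are introduced here for illustration only and do not appear in the code): let $p_i$ be pregpreterm and $d_i$ be delinst for facility $i$ within one district. The mean of pretermrate2 weighted by $d_i$ would then be

```latex
\[
\frac{\sum_i d_i \cdot 100\,\dfrac{p_i}{d_i}}{\sum_i d_i}
  \;=\; 100\,\frac{\sum_i p_i}{\sum_i d_i}
\]
```

which is the ratio of district totals, provided both sums run over the same set of facilities.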

The problem is that when I run both versions of this code, the final graphs show slightly different preterm birth rates. In most districts the two numbers differ by between .05 and .5, and in only one district are they the same. I suspect the problem lies in the ado file for Stata weights, but I'm really not sure how to find out whether that's true, and running this method on different data gave the same numbers in both graphs.

If anyone knows why one version of the code produces different numbers from the other, I would greatly appreciate it!
* Note: with the sample data given below, the final graphed averages are a bit farther apart than with the full unedited dataset.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str10 district int(pregpreterm delinst)
"York" 10 158
"York" 18 272
"York" 11 155
"York" 13 153
"York" 12 206
"York" 14 321
"York" 12 215
"York" 12 222
"York" 14 194
"York" 18 208
"Jersey" 15 220
"Jersey" 18 299
"Jersey" 12 146
"Jersey" 16 175
"Jersey" 10 181
"Jersey" 13 179
"Jersey" 12 175
"Jersey" 15 274
"Jersey" 17 189
"Jersey"  9 160
"Jersey" 12 139
"Jersey" 16 210
"Jersey" 14 171
"Jersey" 14 207
"Jersey"  . 114
"Jersey"  .  84
"Jersey"  .  69
"Jersey"  .  88
"Jersey"  .  75
"Guernsey"  1  89
"Guernsey"  1 138
"Guernsey"  .  55
"Guernsey"  .  96
"Guernsey"  .  59
"Guernsey"  . 102
"Guernsey"  .  66
"Guernsey"  1  76
"Guernsey"  1  92
"Guernsey"  1 114
"Guernsey"  .  67
"Guernsey"  1  72
"Guernsey"  1 103
"Guernsey"  .  44
"Guernsey"  . 122
"Guernsey"  . 117
"Guernsey"  . 135
"Guernsey"  .  57
"Guernsey"  1  73
"Mersey"  .  35
"Mersey"  .  59
"Mersey"  .  31
"Mersey"  1  37
"Mersey"  .  46
"Mersey"  .  37
"Mersey"  1  32
"Mersey"  1  37
"Mersey"  .  46
"Mersey"  .  40
"Mersey"  .  35
"Mersey"  .  48
"Mersey"  .  34
"Mersey"  .  53
"Mersey"  .  50
"Mersey"  .  44
"Mersey"  .  35
"Mersey"  .  52
"Mersey"  1  41
"Mersey"  1  21
"Mersey"  .  32
"Mersey"  1  41
"Mersey"  .  56
"Mersey"  .  20
"Mersey"  .  94
"Mersey"  5 145
"Mersey"  . 117
"Mersey"  5 107
"Mersey"  0  83
"Mersey"  . 106
"Mersey"  2  78
"Mersey"  3  83
"Mersey"  2 101
"Mersey"  3 152
"Percy"  2 102
"Percy"  0  61
"Percy"  0 152
"Percy"  5 192
"Percy"  5  95
"Percy"  .  97
"Percy"  5 103
"Percy"  3 132
"Percy"  3  67
"Percy"  3  64
"Percy"  3  65
"Percy"  . 128
"Percy"  5 138
"Percy"  5  92
"Percy"  0  45
"Percy"  .  40
"Percy"  .  49
"Percy"  .  53
end

        * temporary files that hold the two graphs for the final comparison
        tempfile g1
        tempfile g2
        
        * Method 1: district totals first, then divide
        bys district: egen pretermtotal=total(pregpreterm), missing
        bys district: egen totaldels=total(delinst), missing
        gen pretermrate1=100*pretermtotal/totaldels
        * total deliveries across all observations, shown as n in the graph note
        sum delinst, d
        local myn `r(sum)'
        graph hbar pretermrate1, over(district) blabel(bar, ///
            size(small)) title("Preterm birth rate by district") ytitle("rate of preterm births") ///
            note("Source: MP HMIS data for FY '17-'18 and '18-'19, n = `:di %-12.0fc `myn''") ///
            bargap(40) saving(`g1', replace)
        
        * Method 2: observation-level rate, averaged with deliveries as frequency weights
        gen pretermrate2=100* pregpreterm/delinst
        graph hbar (mean) pretermrate2 [fw=delinst], over(district) blabel(bar, ///
            size(small)) title("Preterm birth rate by district") ytitle("rate of preterm births") ///
            note("Source: MP HMIS data for FY '17-'18 and '18-'19, n = `:di %-12.0fc `myn''") ///
            bargap(40) saving(`g2', replace)
        
        graph combine "`g1'" "`g2'" // compare output from both methods
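
One thing that may be worth checking before looking at the weights code is the missing values in pregpreterm. egen's total() adds up delinst for every facility, including those where pregpreterm is missing, whereas a facility-level rate is itself missing for those facilities, so they drop out of a weighted mean. The sketch below is only a diagnostic under that assumption. It expects the example data above to be in memory, and every chk_* name is a hypothetical helper that is not part of the original code.

```stata
* 1. How many facilities per district have a missing preterm count?
tabulate district if missing(pregpreterm)

* 2. District totals restricted to facilities that report a preterm count
egen chk_preterm = total(pregpreterm), by(district)
egen chk_dels    = total(cond(missing(pregpreterm), ., delinst)), by(district)
gen  chk_rate1   = 100 * chk_preterm / chk_dels

* 3. Facility-level rate (missing wherever pregpreterm is missing)
gen chk_rate2 = 100 * pregpreterm / delinst

* 4. Compare the numbers directly instead of reading them off the bars
tabstat chk_rate1 chk_rate2 [fw=delinst], by(district) statistics(mean) format(%6.3f)
```

If the two columns of the tabstat output agree in every district, the gap between the two graphs would come from the facilities with missing pregpreterm being counted in the denominator of the first method but not the second, rather than from how frequency weights are handled.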

BJ Data Tech Solution
