Hello Statalist,


I have survey data on individual-level use of bed nets against mosquitoes. Individuals were randomly sampled within districts and asked whether they had used a net the previous night (Y/N). The individuals came from 76 districts, which are in turn nested within 14 regions, and the survey was run in 5 annual rounds. The example data below show the first 100 observations from the 2016 round (out of 3000 overall).

My objective is to combine district-level and region-level data so that I obtain district-specific estimates of net use that minimize the error across all districts within the same region. When a district has few observations, I would like its estimate to rely more heavily on the regional data.

I was thinking that I could achieve this with a mixed-effects logistic regression model. In this approach, I think I can model the observed outcome as a function of fixed effects for region and year, with a random effect (random intercept) for district. I expect that the estimate for each district would then be shrunk toward its regional average, with districts that have more observations shrunk less.
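
To make the idea concrete, this is how I would write the model down. The notation is my own assumption: i indexes individuals, j districts, t survey years, and r(j) is the region that contains district j.

```latex
\operatorname{logit}\,\Pr(y_{ijt} = 1) = \beta_0 + \alpha_{r(j)} + \gamma_t + u_j,
\qquad u_j \sim N(0, \sigma_u^2)

% Linear random-intercept analogue of the shrinkage weight on a district's own data,
% with n_j observations in district j and residual variance \sigma_e^2:
\lambda_j = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2 / n_j}
```

In the linear case the weight \lambda_j approaches 1 as n_j grows, so large districts keep most of their own signal. I assume the logistic model behaves in a similar way, but I have not checked the exact form.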

I am just not sure how best to implement this model in Stata.
Code:
clear
input str12 id int year byte(region district netuse)
"        21" 2016 1  9 0
"        421" 2016 1 10 0
"        51" 2016 1  9 1
"        59" 2016 1  9 1
"        51" 2016 1  9 1
"        52" 2016 1  9 0
"        522" 2016 1  9 0
"        73" 2016 1  9 1
"        723" 2016 1  9 0
"        76" 2016 1  9 0
"        716" 2016 1  9 1
"        722" 2016 1  9 1
"        8 20" 2016 1 31 0
"        9  2" 2016 1 31 0
"        9  2" 2016 1 31 0
"        9  5" 2016 1 31 0
"        9 15" 2016 1 31 1
"        9 22" 2016 1 31 0
"       11 17" 2016 1 55 0
"       11 18" 2016 1 55 0
"       12 18" 2016 1 45 1
"       12 22" 2016 1 45 1
"       13 13" 2016 1 13 0
"       14 21" 2016 1 61 0
"       14 21" 2016 1 61 0
"       16  2" 2016 1 61 0
"       17 17" 2016 1 61 1
"       17 18" 2016 1 61 1
"       17 18" 2016 1 61 1
"       18  2" 2016 1 61 0
"       19  2" 2016 1 23 0
"       19  2" 2016 1 23 0
"       19  6" 2016 1 23 0
"       19  7" 2016 1 23 0
"       19  7" 2016 1 23 0
"       19 17" 2016 1 23 0
"       22  5" 2016 2 72 1
"       22  9" 2016 2 72 0
"       22 14" 2016 2 72 1
"       23 10" 2016 2  3 1
"       24  4" 2016 2 16 1
"       24  9" 2016 2 16 1
"       24  9" 2016 2 16 1
"       24 17" 2016 2 16 1
"       25 16" 2016 2 16 1
"       25 17" 2016 2 16 0
"       25 17" 2016 2 16 0
"       25 20" 2016 2 16 0
"       25 22" 2016 2 16 0
"       26  3" 2016 2 52 1
"       26  6" 2016 2 52 0
"       26  8" 2016 2 52 1
"       29 15" 2016 2 76 1
"       29 20" 2016 2 76 0
"       30  3" 2016 2 76 0
"       30  6" 2016 2 76 0
"       30 11" 2016 2 76 0
"       30 15" 2016 2 76 0
"       31  5" 2016 2 76 1
"       31  8" 2016 2 76 0
"       31  8" 2016 2 76 1
"       31 10" 2016 2 76 1
"       32  3" 2016 2 76 0
"       32  6" 2016 2 76 0
"       32 10" 2016 2 76 0
"       32 12" 2016 2 76 0
"       32 12" 2016 2 76 0
"       32 14" 2016 2 76 1
"       32 14" 2016 2 76 1
"       33 12" 2016 2 76 1
"       33 13" 2016 2 76 1
"       33 13" 2016 2 76 1
"       33 17" 2016 2 76 0
"       33 22" 2016 2 76 1
"       34  9" 2016 2 76 1
"       34 15" 2016 2 76 0
"       34 18" 2016 2 76 0
"       35 17" 2016 3  2 1
"       36  2" 2016 3 17 0
"       36  4" 2016 3 17 1
"       36 17" 2016 3 17 0
"       36 18" 2016 3 17 0
"       37 12" 2016 3 17 1
"       37 20" 2016 3 17 0
"       40  8" 2016 3 17 0
"       40 16" 2016 3 17 1
"       40 18" 2016 3 17 0
"       40 19" 2016 3 17 0
"       40 22" 2016 3 17 0
"       41  3" 2016 3 74 1
"       41  7" 2016 3 74 0
"       41 17" 2016 3 74 0
"       42  3" 2016 3 74 0
"       43  1" 2016 3 44 0
"       43 16" 2016 3 44 1
"       43 18" 2016 3 44 0
"       44 11" 2016 3 74 0
"       44 14" 2016 3 74 0
"       44 20" 2016 3 74 0
"       45 10" 2016 3 74 0
end
label values region region
label def region 1 "dakar", modify
label def region 2 "ziguinchor", modify
label def region 3 "diourbel", modify
label values district district_id
label def district_id 2 "Bambey", modify
label def district_id 3 "Bignona", modify
label def district_id 9 "Dakar-nord", modify
label def district_id 10 "Dakar-ouest", modify
label def district_id 13 "Diameniadio", modify
label def district_id 16 "Diouloulou", modify
label def district_id 17 "Diourbel", modify
label def district_id 23 "Guédiawaye", modify
label def district_id 31 "Keur Massar", modify
label def district_id 44 "Mbacké", modify
label def district_id 45 "Mbao", modify
label def district_id 52 "Oussouye", modify
label def district_id 55 "Pikine", modify
label def district_id 61 "Rufisque", modify
label def district_id 72 "Thionck-Essyl", modify
label def district_id 74 "Touba", modify
label def district_id 76 "Ziguinchor", modify


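* Model 1: linear year trend, region fixed effects, district random intercept
melogit netuse year i.region || district:
* Model 2: year as a categorical variable (separate effect for each round)
melogit netuse i.year i.region || district:
* Model 3: as Model 1, plus a district-specific random slope on year
melogit netuse year i.region || district: year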
I am not sure which of the three models above is most appropriate, or whether a different model would be better. Could someone with experience of these kinds of models comment on whether any of them would give me the district estimates I described?
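
For reference, this is roughly how I imagine extracting the district estimates after fitting. It is only a sketch, assuming the full five-round dataset is in memory and using the second model above. The names re_district, re_district_se, p_hat, raw_rate, shrunk_rate, and n are placeholders I made up.

```stata
* Sketch only: assumes all five survey rounds are loaded
* (with the 2016 extract alone, i.year would be dropped)
melogit netuse i.year i.region || district:

* Empirical Bayes predictions of the district random intercepts, with standard errors
predict re_district, reffects reses(re_district_se)

* Predicted probability of net use, including the predicted district effect
predict p_hat, mu

* One row per district-year: raw proportion next to the model-based estimate
preserve
collapse (mean) raw_rate=netuse shrunk_rate=p_hat (count) n=netuse, ///
    by(region district year)
list region district year n raw_rate shrunk_rate, sepby(region)
restore
```

If the model does what I hope, shrunk_rate should sit closer to the regional level than raw_rate in districts where n is small.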

Thank you