I'm hoping someone is willing and able to explain this to me, since I've failed to figure it out on my own. I need to sort estimated means into descending order, per year. I'm using the Gallup World Poll; the estimated means are of an index, by country and year, and respondent-level sampling weights need to be part of the computation of the means to account for the survey design. Since I suspect that sharing a dataex excerpt of those data would complicate my question, the following mock-up shows the situation and the desired outcome.
I start with this dataset:
Code:
clear all
webuse highschool, clear
svyset [pweight=sampwgt]

* making a fake year variable
set seed 12345
generate rannum = uniform()
sort rannum
generate year = .
lab var year "grad year"
drop rannum
replace year = 2009 in 1/999
replace year = 2010 in 1000/1999
replace year = 2011 in 2000/2999
replace year = 2012 in 3000/4071

* making a fake outcome variable
set seed 54321
generate rannum = uniform()
sort rannum
generate happy = .
lab var happy "happiness index"
drop rannum
replace happy = 1 in 1/700
replace happy = 2 in 701/2200
replace happy = 3 in 2201/4071
label define happy 1 "unhappy" 2 "neutral" 3 "happy"
label values happy happy

codebook, compact
Code:
*attempt 1 (using svy:mean with subpop for the if statement restricting year)
svy, subpop(if year==2009): mean happy, over(race) coeflegend
Code:
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =     1         Number of obs   =      4,071
Number of PSUs   = 4,071         Population size =  8,000,000
                                 Subpop. no. obs =        999
                                 Subpop. size    =  2,016,463
                                 Design df       =      4,070

------------------------------------------------------------------------------
             |       Mean  Legend
-------------+----------------------------------------------------------------
c.happy@race |
       White |   2.325196  _b[c.happy@1bn.race]
       Black |   2.207406  _b[c.happy@2.race]
       Other |    2.30351  _b[c.happy@3.race]
------------------------------------------------------------------------------
Code:
*attempt 2 (using weights with arithmetic and stock Stata)
sort race year
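* running weighted sums; the last observation in each group holds the weighted mean
by race year: gen meanHappy = sum(happy * sampwgt) / sum(sampwgt)
* copy that final value to every observation in the group
by race year: replace meanHappy = meanHappy[_N]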
tabstat meanHappy if year==2009, statistics(mean) by(race) columns(statistics)
Code:
Summary for variables: meanHappy
     by categories of: race (1=white, 2=black, 3=other)

  race |      mean
-------+----------
 White |  2.325196
 Black |  2.207406
 Other |  2.303509
-------+----------
 Total |  2.312006
------------------
What I need to somehow produce (using this silly demo example):
Code:
* a tabulation that sorts these means in descending order
White | 2.325196
Other | 2.303509
Black | 2.207406
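For what it's worth, here is a minimal sketch of one way such a listing might be produced. It assumes the mock-up data above are in memory, and it relies on the fact (seen in attempt 2) that the pweighted mean reproduces the svy point estimates. It gives point estimates only, not design-based standard errors, and I offer it as a sketch rather than a settled solution.
Code:
* a sketch: weighted means by year and race, sorted descending within year
preserve
* collapse accepts pweights for (mean); one row per year-race cell
collapse (mean) meanHappy = happy [pweight = sampwgt], by(year race)
* ascending by year, descending by mean within year
gsort year -meanHappy
list year race meanHappy, sepby(year) noobs
restore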
Cheers,
Erika
Editing to add: I also will need to include a measure of precision of the estimates of these means - i.e. standard error of the mean. Haven't looked at that yet, since I'm stuck on this sorting task, but that's next and probably related.
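A possible starting point for the standard errors, again only as a hedged sketch: run svy: mean for one year, copy e(b) and e(V) into matrices, and build a small dataset that can be sorted. It assumes the mock-up data are in memory and svyset as above, and that race is coded 1, 2, 3 as in the tabstat header; the names b, V, mean, se and racelbl are made up for illustration.
Code:
* a sketch: means and standard errors for one year, sorted descending
svy, subpop(if year == 2009): mean happy, over(race)
matrix b = e(b)        // 1 x k row vector of estimated means
matrix V = e(V)        // k x k design-based variance matrix
preserve
clear
set obs `=colsof(b)'
generate race = _n     // assumes race levels are 1, 2, 3 in that order
label define racelbl 1 "White" 2 "Black" 3 "Other"
label values race racelbl
generate mean = .
generate se = .
forvalues j = 1/`=colsof(b)' {
    replace mean = b[1, `j'] in `j'
    replace se = sqrt(V[`j', `j']) in `j'
}
gsort -mean
list race mean se, noobs
restore
Wrapping this in a loop over years would give one sorted listing per year.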