I’m doing an analysis of applications and selections for grants. I do not have a way to identify unique people in the data (and a person can apply for multiple grants), so any analysis has to be done at the level of the application. Below is an example table (made-up data). For each year, it shows what percentage of all applications came from minorities, what percentage came from whites, and what percentage came from applicants of unknown race. (I also have an equivalent table of % selections that I would want to treat the same way.)
% of applications
Year | % Minority | % White | % Race Unknown | Total |
2011 | 56% | 23% | 21% | 100% |
2012 | 55% | 24% | 21% | 100% |
2013 | 45% | 25% | 30% | 100% |
2014 | 35% | 23% | 42% | 100% |
2015 | 34% | 27% | 39% | 100% |
2016 | 40% | 29% | 31% | 100% |
2017 | 32% | 32% | 36% | 100% |
I was asked to provide some kind of “margins of error” around the estimates.
I thought about adding confidence intervals around each proportion, which seem to be easily obtained in Stata using proportion gender, over(year) (it shows the proportion, Std. Err., and logit 95% Conf. Interval).
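As a rough illustration of what a logit-transformed interval involves, the sketch below computes one by hand. It is written in Python rather than Stata only to make the arithmetic visible. The function name `logit_ci`, the count of 1,000 applications, and the use of a normal critical value with the textbook standard error are all assumptions for illustration; Stata's own output may differ slightly (for example, if it uses a t critical value or a different variance divisor).

```python
import math

def logit_ci(successes, n, z=1.959964):
    """Approximate 95% CI for a proportion via the logit transform.

    Assumes each application is an independent draw, which is exactly
    what is in doubt when one person can submit several applications.
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)          # textbook standard error of p
    logit = math.log(p / (1 - p))            # move to the logit scale
    half_width = z * se / (p * (1 - p))      # delta-method SE on the logit scale
    lower = 1 / (1 + math.exp(-(logit - half_width)))  # back-transform
    upper = 1 / (1 + math.exp(-(logit + half_width)))
    return p, se, lower, upper

# Hypothetical counts: 320 of 1,000 applications in 2017 from white applicants.
p, se, lower, upper = logit_ci(320, 1000)
print(f"p={p:.3f}  se={se:.4f}  95% CI=({lower:.3f}, {upper:.3f})")
```

Because the interval is built on the logit scale and then transformed back, it always stays between 0 and 1 and is not symmetric around the estimate.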
My questions are:
- Is it a good idea to show the confidence intervals around every one of these proportions? What am I getting from this? For example, can I use it to say the % of whites was “statistically significantly higher” in 2017 than in 2011 if the confidence intervals do not overlap?
- In reading about confidence intervals, the examples I’ve found are based on surveys. Here, I have a population, as it is everyone who applied for the grant. Thus, are these confidence intervals appropriate?
- Is it a major problem to be showing confidence intervals at the level of the application and not the unique person?
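To make the first question concrete, here is a minimal sketch of the "do the intervals overlap?" check next to a direct two-proportion z-test, using the 2011 and 2017 white percentages from the made-up table. The 300 applications per year is a hypothetical number (the table gives no counts), the helper names are invented for this example, and simple Wald intervals are used instead of Stata's logit intervals to keep the code short. Both calculations treat applications as independent, which may not hold when one person applies more than once.

```python
import math

Z95 = 1.959964  # normal critical value for a two-sided 95% interval

def wald_ci(p, n, z=Z95):
    """Simple Wald interval for a proportion p observed on n applications."""
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for H0: p1 == p2, using the pooled standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

n_per_year = 300                      # hypothetical; not given in the table
ci_2011 = wald_ci(0.23, n_per_year)   # % White in 2011
ci_2017 = wald_ci(0.32, n_per_year)   # % White in 2017

overlap = intervals_overlap(ci_2011, ci_2017)
z = two_proportion_z(0.23, n_per_year, 0.32, n_per_year)
print("intervals overlap:", overlap)
print("z statistic:", round(z, 2), "exceeds 1.96:", abs(z) > Z95)
```

With these made-up counts the two intervals overlap even though the z statistic is above 1.96, so the overlap check and a direct test of the difference do not always agree.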
Note: I am using Stata 15
Any advice would be helpful!
MJ