Hi all, I’m using Stata 15.

I’m analysing applicants for grants. I have a graph showing, for each year, what percentage of the people who submitted at least one application are Male, Female, or of Unknown gender. Within any given year, % Male + % Female + % Unknown = 100%. The graph is attached. (Note: the graph shown here is just an example, not the actual data.) I have a few questions:
  • Is there any benefit to adding 95% confidence intervals around each of the proportions and showing them on this graph? For example, in 2003 I would show the intervals around the 50% (male), 30% (female), and 20% (unknown). What would I gain from this? For example, if the confidence intervals around the % male in 2007 and the % male in 2003 do not overlap, can I say the % male was “statistically significantly higher” in 2007 than in 2003? Or is the value simply in knowing the uncertainty around each proportion?
  • The examples of confidence intervals I’ve found in my reading are based on surveys, i.e. samples. Here I have a population, since the data cover everyone who applied for the grant. Are confidence intervals still appropriate in that case?
  • Some of the same people appear in more than one year. Does this affect the confidence interval calculation?
  • It seems that Stata can calculate both logit confidence intervals and binomial confidence intervals for proportions. How do I know which one is best to use?
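In case it helps, here is a rough sketch of the two approaches I’m comparing. The variable names (year, gender), the coding 1 = Male, 2 = Female, 3 = Unknown, and the counts in the last line are all made up for illustration; they are not my actual data. As far as I understand, -proportion- reports logit-transformed intervals by default, while -ci proportions- reports exact binomial intervals by default.

```stata
* Hypothetical variables: year, gender (1 = Male, 2 = Female, 3 = Unknown)

* Logit-transformed 95% CIs (the -proportion- default), one set per year
proportion gender, over(year)

* -ci proportions- needs a 0/1 indicator, so build one for "male"
generate byte male = (gender == 1) if !missing(gender)

* Exact binomial 95% CIs (the -ci proportions- default), by year
bysort year: ci proportions male

* Immediate form for one year: 200 applicants, 100 of them male (made-up counts)
cii proportions 200 100
```

I would repeat the indicator step for Female and Unknown to get the intervals for the other two lines on the graph.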

Any advice would be helpful.

Many thanks!

MJ


