I am thinking of fitting a fractional response or beta regression of Y on the covariates X1 and X2, for a dataset that looks something like this:
Identifier | Numerator | Denominator | Y | X1 | X2 |
1 | 1000 | 2000 | 0.5 | 1 | 1 |
2 | 2000 | 3000 | 0.666667 | 2 | 3 |
3 | 3000 | 4000 | 0.75 | 3 | 5 |
4 | 4000 | 5000 | 0.8 | 4 | 7 |
5 | 5000 | 6000 | 0.833333 | 5 | 9 |
6 | 6000 | 7000 | 0.857143 | 6 | 11 |
7 | 7000 | 8000 | 0.875 | 7 | 13 |
8 | 8000 | 9000 | 0.888889 | 8 | 15 |
9 | 9000 | 10000 | 0.9 | 9 | 17 |
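For concreteness, here is a minimal sketch of how I would fit the (unweighted) fractional logit in Python with statsmodels, assuming that a binomial-family GLM on the fractional outcome with robust standard errors is an acceptable way to do this; the variable names simply mirror the table above.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Data from the table above
df = pd.DataFrame({
    "Numerator":   [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000],
    "Denominator": [2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000],
    "X1":          [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "X2":          [1, 3, 5, 7, 9, 11, 13, 15, 17],
})
df["Y"] = df["Numerator"] / df["Denominator"]   # observed proportion

# Unweighted fractional logit: binomial-family GLM on the fractional outcome Y,
# with heteroskedasticity-robust (sandwich) standard errors.
unweighted = smf.glm("Y ~ X1", data=df, family=sm.families.Binomial()).fit(cov_type="HC1")
print(unweighted.summary())
```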
However, it seems intuitive to me to assign more "weight" to observations with a larger sample size and/or number of events. For example, Observation 9 is based on a denominator of 10000 and should therefore have the greatest precision, i.e. the narrowest confidence interval if we were to compute a 95% CI for the proportion. That is, the confidence interval around the proportion 9000/10000 is narrower than the confidence interval around a proportion of, say, 1000/2000.
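To make this concrete, using the usual large-sample standard error of a sample proportion:

$$
\widehat{\operatorname{SE}}(\hat p) = \sqrt{\frac{\hat p(1-\hat p)}{n}},\qquad
\widehat{\operatorname{SE}}\!\left(\tfrac{9000}{10000}\right)=\sqrt{\frac{0.9\times 0.1}{10000}}\approx 0.0030,\qquad
\widehat{\operatorname{SE}}\!\left(\tfrac{1000}{2000}\right)=\sqrt{\frac{0.5\times 0.5}{2000}}\approx 0.0112,
$$

so the 95% CI for Observation 9 is roughly a quarter of the width of that for Observation 1.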
In line with this intuition, I was thinking of assigning inverse-variance (or inverse-SE) weights in the analysis, computed from the standard error of the logit of Y, as sketched below.
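Concretely, the scheme I have in mind (my own construction, so the formula and code are only a sketch of the idea, not something I have seen in the literature) uses the delta-method standard error of logit(Y), which for k events out of n trials is sqrt(1/k + 1/(n - k)), and plugs the inverse variance in as weights, continuing the statsmodels sketch above:

```python
import numpy as np

# Delta-method SE of logit(p-hat): sqrt(1/k + 1/(n - k)), k = events, n = trials
k = df["Numerator"]
n = df["Denominator"]
se_logit = np.sqrt(1.0 / k + 1.0 / (n - k))

# Inverse-variance weights; algebraically 1 / (1/k + 1/(n - k)) = n * p * (1 - p)
df["w"] = 1.0 / se_logit**2

weighted = smf.glm(
    "Y ~ X1",
    data=df,
    family=sm.families.Binomial(),
    var_weights=np.asarray(df["w"]),   # inverse-variance weights on the logit scale
).fit(cov_type="HC1")
print(weighted.summary())
```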
But I am also conscious that I have not found any references to support such an approach.
I was hoping to get some expert advice: are you aware of any statistical theory or simulation papers that might support the above methodology?
Thanks in advance!
Best regards,
Nicholas Syn