Dear Statalist community,
I am doing research using insurance claim data, where the dependent variable of interest is the loss-cost ratio, i.e., the indemnity amount divided by total liability. By construction it is a fractional variable bounded in [0,1]. However, it has an excess of zeros because of deductibles, and my understanding is that these zeros are effectively censored: an observed zero can correspond to a positive actual loss. Put simply, the dependent variable is a fractional response with censored zeros. I can think of several modeling approaches, but, if I understand them correctly, each of them misses some aspect of the problem:
1. Fractional response model as in Papke and Wooldridge (1996): may not be the best choice when the share of zero observations is large; it also ignores the censoring at zero.
2. Two-limit Tobit: ignores the fractional nature of the variable and relies on strong distributional assumptions.
3. Zero-inflated beta model as in Cook et al. (2011): does not account for the censoring at zero.
4. Two-part fractional response model as in Ramalho and Ramalho (2011): for reasons specific to our study, we want to analyze a balanced panel, but the second part of the two-part model is estimated only on the subsample of observations in (0,1), which leaves an unbalanced panel in estimation. Hence we prefer not to use this.
5. Fractional response model augmented by modeling heteroskedasticity, as in Wooldridge's slides (page 7): honestly, I don't understand why this works, and I'd appreciate it if anyone could explain. It also does not reflect the censoring at zero.
So my questions are:
(1) Why is approach 5 above able to account for excessive zeros?
(2) What would be the best approach to model the dependent variable described above, i.e., a fractional variable with excessive censored zeros, while still estimating on a balanced panel?
Also, if I have misunderstood anything, please feel free to point it out. Thanks!
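To make the setup concrete, here is a minimal Python sketch of the censoring mechanism described at the top of this post. It is only an illustration under made-up assumptions, not the actual data: liability is normalized to 1, the latent loss share is drawn uniformly on [0, 1], and the deductible is a fixed share of liability. The function name `simulate_loss_cost_ratio` and all parameter values are hypothetical.

```python
import random


def simulate_loss_cost_ratio(n=10_000, deductible=0.2, seed=0):
    """Simulate loss-cost ratios that are censored at zero by a deductible.

    Hypothetical setup: liability is normalized to 1, so the indemnity
    itself is the loss-cost ratio. Returns (latent_loss, ratio) pairs.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Actual (latent) loss as a share of liability; assumed uniform.
        latent_loss = rng.random()
        # Indemnity is paid only on the part of the loss above the deductible.
        ratio = max(latent_loss - deductible, 0.0)
        rows.append((latent_loss, ratio))
    return rows


rows = simulate_loss_cost_ratio()

# Observed zeros in the loss-cost ratio.
zeros = [r for r in rows if r[1] == 0.0]
# Zeros that hide a strictly positive latent loss (the "censored" zeros).
hidden = [r for r in zeros if r[0] > 0.0]

share_zero = len(zeros) / len(rows)
share_hidden = len(hidden) / len(zeros)

print(f"share of zero ratios: {share_zero:.3f}")
print(f"share of zeros with a positive latent loss: {share_hidden:.3f}")
```

In this toy setup essentially every observed zero corresponds to a positive latent loss below the deductible, which is the sense in which the zeros are censored rather than "true" zeros.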
Much appreciated,
Zhenni