Dear Statalist,

I have observations in the interval [0,1] with two modes at 0 and 1. I want to model these data in the following way:
1. as a mixture of a Bernoulli random variable and a beta random variable.
2. depending on an unobservable state variable.
For the second part, a finite mixture model seems to be the right approach, but I have not yet figured out how to combine it with the first part.

Below I provide a simulated dataset in which the "true" values are known. The aim is to recover these values (the state probability, the probabilities of belonging to the discrete or the continuous part, and the distribution parameters) via a finite mixture model or another modelling approach.
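
To make the target explicit, the distribution I have in mind can be written as follows. The notation is my own and not taken from the paper: \pi_s is the state probability, q_s the probability of the discrete part in state s, r_s the probability of a 1 given the discrete part, and (a_s, b_s) the beta shape parameters.

```latex
\[
f(y) = \sum_{s \in \{0,1\}} \pi_s \Big[
    q_s (1 - r_s)\,\mathbf{1}\{y = 0\}
  + q_s r_s\,\mathbf{1}\{y = 1\}
  + (1 - q_s)\,\frac{y^{a_s - 1}(1 - y)^{b_s - 1}}{B(a_s, b_s)}\,\mathbf{1}\{0 < y < 1\}
\Big]
\]
```

In the simulation below, \pi_0 = \pi_1 = 0.5, (q_0, r_0, a_0, b_0) = (0.4, 0.7, 5, 2) and (q_1, r_1, a_1, b_1) = (0.6, 0.3, 2, 5).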

Code:
drop _all
clear
set seed 12345
set obs 10000

gen state = rbinomial(1, 0.5)
// Indicator for the mixture component (state) an observation belongs to
// P(state == 0) = P(state == 1) = 0.5 in this case

gen binom1 = rbinomial(1, 0.7)
gen beta1 = rbeta(5,2)
gen p1 = rbinomial(1,0.4)
// In state 0: probability for 0 is 0.4*0.3, probability for 1 is 0.4*0.7
// probability for a draw from the continuous part is 0.6

gen binom2 = rbinomial(1, 0.3)
gen beta2 = rbeta(2,5)
gen p2 = rbinomial(1,0.6)
// In state 1: probability for 0 is 0.6*0.7, probability for 1 is 0.6*0.3
// probability for a draw from the continuous part is 0.4

gen y = .
replace y = p1*binom1+(1-p1)*beta1 if state == 0
replace y = p2*binom2+(1-p2)*beta2 if state == 1
// y is a mixed random variable (mixture of a Bernoulli random variable and a beta random variable)

hist y

// I want to estimate:
// 1. state probability (should be 0.5)
// 2. probability of an observation being 0, 1 or a draw from the continuous part
// 3. shape parameters a, b for the two beta distributions
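
As a sanity check on the simulation, the shares of 0, 1 and interior values can be tabulated by state. This is only a minimal sketch: it works here because state is observed in the simulated data, which would not be the case in real data, and the variable name part is introduced purely for illustration.

```stata
// Classify each observation: 0 = exactly 0, 1 = interior (beta draw), 2 = exactly 1
gen byte part = cond(y == 0, 0, cond(y == 1, 2, 1))
label define part 0 "y = 0" 1 "0 < y < 1" 2 "y = 1"
label values part part

// Row shares by state; implied by the simulation parameters:
// state 0: 0.12 / 0.60 / 0.28, state 1: 0.42 / 0.40 / 0.18
tabulate state part, row

// Mean and variance of the continuous part by state;
// theoretical means are 5/7 for Beta(5,2) and 2/7 for Beta(2,5)
tabstat y if part == 1, by(state) statistics(mean variance)
```

The tabulated shares should be close to the values listed in the comments, up to sampling error with 10,000 observations.
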
This methodology is used in "Calabrese (2014) - Downturn Loss Given Default", and I would like to adapt it.

Any help is appreciated.

Kind regards

Steffen