Hi there,

I am trying to create a variable u with 1000 observations that is distributed according to a mixture of normals: with probability 0.5 ui drawn from N (−5, 3), and with probability 0.5 it is drawn from N (5, 1). I am very new to Stata, so have been attempting the following code, but there is likely a much easier way to do this.


set seed 9876
set obs 1000

gen e1 = rnormal(-5, 3)

gen e2 = rnormal(5, 1)

gen p = runiform()

if p>0.5 {
generate u=e1
label variable u "ue1"
}
else if p<0.5 {
generate u=e2
label variable u "ue2"
}

The problem is that whatever the first value of p is, that is what the entire list of u becomes. For example, the first observation of p is <0.5, so my entire set of u are from the e2 distribution, as seen below


+--------------------------------+
| p e2 u |
|--------------------------------|
1. | .3038384 5.882499 5.882499 |
2. | .0229448 5.948315 5.948315 |
3. | .7245733 5.705297 5.705297 |
4. | .1281148 4.831472 4.831472 |
5. | .8212033 4.907266 4.907266 |
|--------------------------------|
6. | .3516393 6.40349 6.40349 |
7. | .6429143 4.464439 4.464439 |
8. | .522856 3.017771 3.017771 |
9. | .1444471 4.213279 4.213279 |
10. | .1331125 5.73031 5.73031 |
|--------------------------------|
11. | .0952627 3.310728 3.310728 |
12. | .8458152 5.74537 5.74537 |
13. | .1512675 2.610965 2.610965 |
14. | .7207158 4.76138 4.76138 |
15. | .4228746 4.465273 4.465273 |


I am not sure if I am supposed to run this as 1000 simulations of a randomly generated p, or if there is a much simpler way of getting a bimodal distribution on Stata. Any help is incredibly appreciated.
Thanks
Mike