Hello,

I have a dataset with 14 variables relating to ethnic group: eth1, eth2, eth3, ..., eth14.

The values are probabilities between 0 and 1, and an individual observation can have a non-zero value for more than one ethnicity: for example, 0.5 for eth1, 0.25 for eth10 and 0.25 for eth14, with all the other ethnic groups at 0. This is an oversimplification, but it gives an idea of how the data are arranged.

For around 60% of observations I know the actual ethnicity rather than just the modelled probability. Where I know the ethnicity, I want to overwrite the modelled probability with a 1 and set all the other ethnicities to 0. So in the above example, eth1 would be 1, and eth10, eth14 and all the other ethnic groups would be 0. The actual ethnicity of each individual is held in a variable called Ethnic_Group.

Having created a copy of the 14 eth variables I have overwritten them as follows:

replace eth1 = 1 if Ethnic_Group == "A"
replace eth1 = 1 if Ethnic_Group == "a"

Having done this for each eth variable, from eth1 to eth14, I now want to write some code that sets all the other eth variables to 0 if one of them is 1:

* sets every eth value below 1 to 0 (as written, this applies to all observations)
foreach var of varlist eth* {
    replace `var' = 0 if `var' < 1
}

BUT I only want to do the above if one of the eth variables is indeed 1. If none of the eth variables is 1, no changes should be made to them.

I am not sure how to amend my code so that the values are not replaced with 0 when none of the eth variables is 1.
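
One approach that might work, as a rough sketch only: first flag the observations where at least one eth variable equals 1, then restrict the replace to those observations. The variable name known is made up for illustration, and the sketch assumes that eth* matches only the 14 variables to be changed (not the copies) and that a value of exactly 1 only occurs where the ethnicity is known, or is harmless otherwise.

```stata
* Flag observations in which at least one eth variable equals 1.
* "known" is a hypothetical helper variable; eth* is assumed to match
* only the 14 working variables, not the copies.
egen byte known = anymatch(eth*), values(1)

* Zero out the remaining probabilities, but only in flagged observations.
* Missing values are left untouched, since missing is not < 1 in Stata.
foreach var of varlist eth* {
    replace `var' = 0 if known == 1 & `var' < 1
}

* Remove the helper variable once it is no longer needed.
drop known
```

Observations where no eth variable is 1 have known equal to 0, so the replace leaves their modelled probabilities as they are. If the copies of the original variables also start with eth, the varlist would need to be narrowed (for example eth1-eth14, assuming the variables sit next to each other in that order).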