Hello,
I have a dataset with 14 variables relating to ethnic group: eth1, eth2, eth3, ..., eth14.
The values are probabilities between 0 and 1, and a single observation can have a nonzero value for more than one ethnicity: for example, 0.5 for eth1, 0.25 for eth10 and 0.25 for eth14, with all the other ethnic groups at 0. This is an oversimplification, but it gives an idea of how the data are arranged.
For around 60% of observations I know the actual ethnicity, not just the modelled probability. Where I know the ethnicity, I want to overwrite the modelled probabilities with a 1 for that ethnicity and set all the other ethnicities to 0. So in the above example, eth1 would be 1, and eth10, eth14 and all the other ethnic groups would be 0. The actual individual ethnicities are contained in a variable called Ethnic_Group.
Having created a copy of the 14 eth variables, I have overwritten them as follows:
replace eth1 = 1 if Ethnic_Group=="A"
replace eth1 = 1 if Ethnic_Group=="a"
Having done this for each variable from eth1 to eth14, I now want to write some code to set all the other eth variables to 0 if one of them is 1:
foreach var of varlist eth* {
    replace `var' = 0 if `var' < 1
}
However, I only want to do this if one of the variables is indeed 1. If none of the eth variables is 1, no changes should be made to them.
I am not sure how to amend my code so that the values are not replaced with 0 when none of the eth variables is 1.
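One possible way to express that condition, offered as a minimal sketch rather than a definitive answer, is to compute the row-wise maximum of the eth variables first and zero the others only in rows where that maximum is 1. The toy data, the three-variable range eth1-eth3, the mapping of "B"/"b" to eth2, and the helper variable name maxeth are all assumptions made for illustration; with the real data the range would be eth1-eth14. The helper is deliberately not named eth-something, so that a wildcard such as eth* would not pick it up.

```stata
* Toy data with hypothetical values, standing in for the real dataset
clear
input str1 Ethnic_Group eth1 eth2 eth3
"A" 0.5 0.25 0.25
""  0.2 0.3  0.5
"b" 0.6 0.1  0.3
end

* Step described in the post: set the known group's variable to 1
* (the "B"/"b" -> eth2 mapping is assumed for illustration only)
replace eth1 = 1 if inlist(Ethnic_Group, "A", "a")
replace eth2 = 1 if inlist(Ethnic_Group, "B", "b")

* Row-wise maximum across the eth variables (missing values are ignored)
egen double maxeth = rowmax(eth1-eth3)

* Zero the remaining variables only in rows where some eth variable is 1;
* rows with no 1 are left exactly as they were
foreach var of varlist eth1-eth3 {
    replace `var' = 0 if maxeth == 1 & `var' < 1
}

drop maxeth
list
```

A row whose modelled probability was already exactly 1 would also satisfy maxeth == 1, but its other values should already be 0 if the probabilities sum to 1. If Ethnic_Group is nonempty only for the observations with a known ethnicity, conditioning on !missing(Ethnic_Group) instead of maxeth == 1 might be an alternative.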