My data looks like this
fruit_variable has values
1. Mango
2. Mango, pineapple
3. Mango, grapes
4. Banana, grapes, chickoo
5. strawberry , mango, orange
7. orange, banana , mango
I want to know which fruit is produced the most, down to the least. That is not possible while the values are stored as comma-separated lists.
So I used the split command to put each comma-separated option into its own variable, like this:
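For reference, the split step might look roughly like the sketch below. The toy data simply retype the example values, and the cleanup loop is an assumption on my part (that "Mango" and " mango" should count as the same fruit), not something split does by itself.

```stata
* Hypothetical toy data mirroring the example values
clear
input str40 fruit_variable
"Mango"
"Mango, pineapple"
"Mango, grapes"
"Banana, grapes, chickoo"
"strawberry , mango, orange"
"orange, banana , mango"
end

* split creates fruit_variable1, fruit_variable2, ... (one per comma-separated piece)
split fruit_variable, parse(",")

* Assumption: case and stray spaces should be ignored when comparing fruits
* "?" matches exactly one character, so the unsplit fruit_variable is left alone
foreach v of varlist fruit_variable? {
    replace `v' = lower(strtrim(`v'))
}

list
```

Without the cleanup, a test such as fruit_variable2 == "Mango" would miss values like " mango" that still carry a leading space or a different case.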
fruit_variable1   fruit_variable2   fruit_variable3   fruit_variable4
mango
mango             Pineapple
mango             Grapes
Banana            Grapes            Chickoo
Strawberry        Mango             orange
Orange            Banana            Mango
Next I created a new variable for each fruit: fruit_mango, fruit_pineapple, and so on.
I did
gen fruit_mango = 1 if fruit_variable1 == "Mango"
replace fruit_mango = 1 if fruit_variable2 == "Mango"
replace fruit_mango = 1 if fruit_variable3 == "Mango"
and so on
But I don't think this is smart coding. Can it be done another way?
Is there a way to refer to all the fruit_variable_n variables together instead of writing each one out every time? Doesn't the "*" symbol stand for the "n" (the number a variable gets after splitting)?
How do I do this?
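One possible approach, sketched under assumptions: the toy data below are hypothetical and already lowercased and trimmed, and the id and position variables are names made up for the example. A foreach loop over a wildcard varlist removes the repeated replace lines. Stacking the split variables with reshape long and then running tabulate with the sort option lists every fruit from most to least frequent, with no indicator variables needed.

```stata
* Hypothetical toy data in the already-split layout (lowercased, trimmed)
clear
input str12 fruit_variable1 str12 fruit_variable2 str12 fruit_variable3
"mango"      ""          ""
"mango"      "pineapple" ""
"mango"      "grapes"    ""
"banana"     "grapes"    "chickoo"
"strawberry" "mango"     "orange"
"orange"     "banana"    "mango"
end
gen long id = _n   // hypothetical row identifier, needed by reshape

* One indicator, built by looping over the split variables
* "?" matches exactly one character; "*" matches zero or more
gen byte fruit_mango = 0
foreach v of varlist fruit_variable? {
    replace fruit_mango = 1 if `v' == "mango"
}

* Frequencies of all fruits at once: stack the split variables, then tabulate
preserve
reshape long fruit_variable, i(id) j(position)
drop if fruit_variable == ""
tabulate fruit_variable, sort   // sort lists the most frequent value first
restore
```

In a varlist, "*" matches any run of characters, including none, so fruit_variable* would also pick up the original unsplit fruit_variable if it is still in the dataset. With more than nine split variables, "?" would miss the two-digit ones, so a range such as fruit_variable1-fruit_variable12 may be safer there.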