After splitting a variable that had options separated by commas, how do I find which values have the highest and the lowest frequency?

My data looks like this:

fruit_variable has values
1. Mango
2. Mango, pineapple
3. Mango, grapes
4. Banana, grapes, chickoo
5. strawberry , mango, orange
7. orange, banana , mango

I want to rank the fruits from most to least frequently produced. That is not possible while the values are stored as comma-separated lists.
So I used the split command to put each comma-separated option into its own variable, like this:

fruit_variable1   fruit_variable2   fruit_variable3   fruit_variable4
mango
mango             Pineapple
mango             Grapes
Banana            Grapes            Chickoo
Strawberry        Mango             orange
Orange            Banana            Mango
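
For reference, the split step described above might look like the sketch below. The `str40` width is an assumption, and the clean-up loop is an addition not in the original post: it trims stray spaces and unifies capitalisation so that "mango" and " Mango" count as the same fruit. `strtrim()` and `strproper()` assume Stata 14 or later.

```stata
* Rebuild the example data (values copied from the listing above)
clear
input str40 fruit_variable
"Mango"
"Mango, pineapple"
"Mango, grapes"
"Banana, grapes, chickoo"
"strawberry , mango, orange"
"orange, banana , mango"
end

* Split on commas; the new variables are named fruit_variable1, fruit_variable2, ...
split fruit_variable, parse(",")

* split returns the names of the variables it created in r(varlist)
local parts `r(varlist)'

* Remove leading/trailing spaces and capitalise the first letter of each value
foreach v of local parts {
    replace `v' = strproper(strtrim(`v'))
}

list
```

With this example data, split creates three variables, because no observation lists more than three fruits.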

Next I created a new variable for each fruit: fruit_mango, fruit_pineapple and so on.
I did:
gen fruit_mango=1 if fruit_variable1=="Mango"
replace fruit_mango=1 if fruit_variable2=="Mango"
replace fruit_mango=1 if fruit_variable3=="Mango"
and so on

But I don't think this is smart coding. Can it be done another way?
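
One possible shortcut, as a minimal sketch: in a Stata varlist, `*` matches any run of characters, so `fruit_variable*` picks up every split variable and a `foreach` loop can replace the repeated lines. The small dataset below is typed in by hand from the example, assuming the values have already been trimmed and capitalised consistently.

```stata
* Example data as it might look after splitting and cleaning
clear
input str12 fruit_variable1 str12 fruit_variable2 str12 fruit_variable3
"Mango"      ""          ""
"Mango"      "Pineapple" ""
"Mango"      "Grapes"    ""
"Banana"     "Grapes"    "Chickoo"
"Strawberry" "Mango"     "Orange"
"Orange"     "Banana"    "Mango"
end

* Start at 0, then switch to 1 if any of the split variables equals "Mango"
gen byte fruit_mango = 0
foreach v of varlist fruit_variable* {
    replace fruit_mango = 1 if `v' == "Mango"
}

* Number of observations that mention mango
count if fruit_mango == 1
```

Note that `fruit_variable*` would also match the original unsplit `fruit_variable` if it is still in the dataset. That is harmless for this comparison, but worth keeping in mind.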

Is there a way to refer to all the fruit_variable_n variables (the series created by split) together, instead of writing each one out every time? Doesn't the "*" symbol stand for the "n" (the suffix each variable gets after splitting)?
How do I do this?
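
A different route that avoids one indicator per fruit, offered as a sketch rather than the definitive answer: reshape the split variables to long form so that each fruit mention is its own row, then tabulate with the `sort` option to list frequencies from highest to lowest. The names `id`, `position` and the stub `fruit` are assumptions chosen for illustration.

```stata
* Rebuild the example data (values copied from the question)
clear
input str40 fruit_variable
"Mango"
"Mango, pineapple"
"Mango, grapes"
"Banana, grapes, chickoo"
"strawberry , mango, orange"
"orange, banana , mango"
end

* An identifier is needed for reshape
gen long id = _n

* Split into fruit1, fruit2, ... and drop the original string
split fruit_variable, parse(",") generate(fruit)
drop fruit_variable

* One row per observation and position
reshape long fruit, i(id) j(position)

* Clean the values and discard empty slots
replace fruit = strproper(strtrim(fruit))
drop if fruit == ""

* Frequencies in descending order
tabulate fruit, sort
```

The first row of the table is then the most frequent fruit and the last row the least frequent, with ties listed next to each other.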

BJ Data Tech Solution
