categorize a variable if certain words are found in it from a big list

Hi everyone,

I want to categorize a variable based on a list that I maintain. The list is continuously updated and has more than 1,000 unique, cleaned, and categorized businesses.
Now I want to loop through each observation of my variable in a Stata dataset and check whether any word from the clean list is present. If so, that observation should be assigned the category listed for that word; if not, move on to the next observation.

Doing this manually over and over is time-consuming, so I need a way to code it in Stata.

The merge command won't work here, since it matches on exact values and not on substrings within a variable.

I tried the following, but it is a time-consuming and tedious manual process:

Code:

* Assumes category is a string variable; regexm() takes exactly two arguments
foreach i in Bike Rickshaw Van {
    replace category = "Transport" if regexm(business, "`i'")
}

The following is a snapshot of the list and the data (variables):

* clean list

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str14 word str11 category
"Rickshaw"       "Transport"  
"Bike"           "Transport"  
"Stitching"      "Enterprise" 
"Livestock"      "Live Stock" 
"trading"        "Enterprise" 
"servicees"      "Enterprise" 
"housing"        "Enterprise" 
"milk selling"   "Agriculture"
"vegetable sale" "Agriculture"
"vegetable shop" "Agriculture"
end

* data to categorize based on list

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str18 business byte category
"Live Stock"         .
"Trading & Business" .
"Handi Craft"        .
"Others"             .
"Agriculture"        .
"Manufacturing"      .
"Commerce"           .
"Commucation System" .
"Stitching Work"     .
"Transport"          .
"Shoe Business"      .
"Auto part Workshop" .
"Education"          .
"Cloth selling"      .
"Animal Trading"     .
end

Thanks in advance.
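
One possible way to automate this, offered as a minimal sketch rather than a definitive solution: read the clean list once, copy each word and its category into local macros, then loop over those macros in the data to be categorized, using strpos() on lowercased text for case-insensitive substring matching. The inline subsets of the list and data, the macro names w1, c1, and so on, and the choice to create category as a string variable (the data example above declares it as byte) are all assumptions made for illustration. In practice the two input blocks would be replaced by use of the real files.

```stata
* Sketch only: small subsets of the posted list and data, typed inline
clear
input str14 word str11 category
"Rickshaw"     "Transport"
"Bike"         "Transport"
"Stitching"    "Enterprise"
"Livestock"    "Live Stock"
"trading"      "Enterprise"
"milk selling" "Agriculture"
end

* Copy every word (lowercased) and its category into numbered local macros
local n = _N
forvalues j = 1/`n' {
    local w`j' = lower(word[`j'])
    local c`j' = category[`j']
}

clear
input str18 business
"Live Stock"
"Trading & Business"
"Handi Craft"
"Stitching Work"
"Transport"
"Animal Trading"
end

* Assumption: category is stored as a string, empty until a word matches
generate str11 category = ""

* First matching word in list order wins; already-categorized rows are skipped
forvalues j = 1/`n' {
    replace category = "`c`j''" if category == "" & strpos(lower(business), "`w`j''") > 0
}

list business category, clean
```

With these subsets, "Trading & Business", "Stitching Work", and "Animal Trading" are categorized as Enterprise, while "Live Stock" stays empty because the list spells it "Livestock". Spacing and spelling variants would need their own list entries or some normalization first. Because strpos() matches any substring, a short word such as "Van" would also match inside longer words, so a word-boundary regular expression may be preferable for short entries.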

BJ Data Tech Solution
