Dataset 1
SOME_KIND_OF_NAME |
THESQUIRL WAS YELLOW AND SMOOTH |
THESQUIRL WAS YELLOW SUNSHINE AND SMOOTH |
THESQUIRLWASPURPLE |
BLUE MUFFINS ARE-AWESOME |
BLUE-RAY MUFFINS ARE |
Dataset 2 - lookup table
COLORS |
GREEN |
PURPLE |
YELLOW SUNSHINE |
BLUE-RAY |
use "DIRECTORY-dataset1", clear
matchit SAMPLE_ID SOME_KIND_OF_NAME using "directory-dataset2", idu(ID) txtu(colors) sim(token) t(0)
MATCH
THESQUIRL WAS YELLOW AND SMOOTH | YELLOW SUNSHINE > wrong (I only want a match when the long string contains the exact phrase YELLOW SUNSHINE, with both words together) |
THESQUIRL WAS YELLOW SUNSHINE AND SMOOTH | YELLOW SUNSHINE |
THESQUIRLWASPURPLE | PURPLE |
BLUE MUFFINS ARE-AWESOME | BLUE-RAY > wrong (I only want a match when the long string contains the exact phrase BLUE-RAY, with both parts together) |
BLUE-RAY MUFFINS ARE | BLUE-RAY |
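
For reference, the rule I am after (keep a pair only when the lookup value appears in the long string as one exact, contiguous piece) can be sketched without matchit. This is only a rough sketch under assumptions: it rebuilds the two example datasets inline with input instead of loading the real files, it treats the lookup variable as being named COLORS, and it swaps fuzzy matching for a plain cross join followed by a strpos() substring test.

```stata
* Sketch only: the example data are typed in here, not read from the real files.
clear
input str40 COLORS
"GREEN"
"PURPLE"
"YELLOW SUNSHINE"
"BLUE-RAY"
end
tempfile lookup
save `lookup'

clear
input str60 SOME_KIND_OF_NAME
"THESQUIRL WAS YELLOW AND SMOOTH"
"THESQUIRL WAS YELLOW SUNSHINE AND SMOOTH"
"THESQUIRLWASPURPLE"
"BLUE MUFFINS ARE-AWESOME"
"BLUE-RAY MUFFINS ARE"
end

* Pair every long string with every lookup value.
cross using `lookup'

* Keep a pair only if the lookup value occurs as a contiguous substring.
keep if strpos(SOME_KIND_OF_NAME, COLORS) > 0

list SOME_KIND_OF_NAME COLORS, noobs
```

On the example data this keeps the three pairs I consider correct (YELLOW SUNSHINE, PURPLE and BLUE-RAY, each with the string that really contains it) and drops the two marked wrong above. The cross join creates rows in master times rows in lookup, so it may not be practical for large files, and strpos() is case-sensitive.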
Thank you!