This message was initially posted in the discussion thread
https://www.statalist.org/forums/forum/general-stata-discussion/general/1307980-matchit-command-to-match-two-datasets-based-on-similar-text-pattern
In most of the string similarity discussions on Statalist, users are trying to find similarities between variables. I, however, would like to get a similarity score for observations within the same string variable. My data set contains more than 10,000 person records, and most likely there are hundreds of people who occur in the data set multiple times, but with slightly differently spelled names.
Do you have any experience with checking for string similarity within the same variable, and may I ask which package you decided to use in the end?
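To make the goal concrete, here is a minimal sketch of the kind of within-variable comparison I have in mind, written in Python rather than Stata purely for illustration. The example names, the `similar_pairs` helper, and the 0.8 threshold are all hypothetical, and `difflib.SequenceMatcher` is only one possible similarity measure, not a recommendation:

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(names, threshold=0.8):
    """Return (i, j, score) for each pair of distinct rows with similar names.

    The helper name and the default threshold are illustrative assumptions.
    """
    pairs = []
    # combinations() visits each unordered pair once, so a record is never
    # compared with itself and (i, j) is not repeated as (j, i).
    for (i, a), (j, b) in combinations(enumerate(names), 2):
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            pairs.append((i, j, round(score, 3)))
    return pairs


# Hypothetical example records, not taken from the actual data set.
names = ["Jan de Vries", "Jan de Vreis", "Pieter Bakker", "P. Bakker", "Anna Smit"]
for i, j, score in similar_pairs(names):
    print(names[i], "<->", names[j], score)
```

Comparing every pair this way needs n(n-1)/2 comparisons, which is roughly 50 million for 10,000 records, so a brute-force loop like this is only meant to show the idea.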
Thank you for sharing your experience!
Best wishes,
Moniek