Dear all,

This message was initially posted in the discussion thread
https://www.statalist.org/forums/forum/general-stata-discussion/general/1307980-matchit-command-to-match-two-datasets-based-on-similar-text-pattern
but I was advised to post it as a new thread, with a title better matching my question, so here we go!

In most of the string similarity discussions on Statalist, users are trying to find similarities between variables. I, however, would like to get a similarity score between observations within the same string variable. My data set contains more than 10,000 person records, and most likely hundreds of people occur in it multiple times, but with slightly differently spelled names.

Do you have any experience with checking for string similarity within the same variable, and may I ask which package you decided to use in the end?

Thank you for sharing your experience!

Best wishes,

Moniek