Hello guys,
I am trying to fuzzy-match two string variables (var1 and var2) in my dataset using Levenshtein distance (the -strdist- package), which seems to fit my needs.
The only problem is that I need the Levenshtein distance between each observation of var1 and every observation of var2, and I am not sure how to do that. At the moment, running strdist var1 var2 gives me only the row-wise result: the distance between var1 and var2 within the same observation. Does anyone know how best to compute it for all pairs?
Best,
Fredrick
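One possible approach, offered as a sketch rather than a definitive answer: since strdist works row by row, the data can first be reshaped so that every var1/var2 combination sits on its own row, for example with Stata's built-in cross command. The names v2list and lev_dist below are made up for illustration, and the code assumes strdist is installed from SSC and that var1 and var2 are string variables.

```stata
* Sketch only: assumes strdist is installed (ssc install strdist)
* and that var1 and var2 are string variables in the data in memory.

* Save the distinct, non-empty var2 values as a temporary dataset
preserve
keep var2
drop if missing(var2)
duplicates drop
tempfile v2list
save `v2list'
restore

* Keep the distinct var1 values, then form every var1 x var2 combination
keep var1
drop if missing(var1)
duplicates drop
cross using `v2list'

* There is now one row per pair, so the row-wise distance covers all pairs
strdist var1 var2, generate(lev_dist)

* Put the closest var2 candidates first for each var1 value
sort var1 lev_dist var2
```

Run this as one block from a do-file so the temporary file survives until cross uses it. It replaces the data in memory, so save the original dataset first. The crossed dataset has (number of var1 values) x (number of var2 values) rows, which can get large quickly.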