Hello guys,
I am currently trying to do fuzzy matching of two string variables (var1 and var2) in my dataset using Levenshtein distance (the strdist package), which seems to fit my needs.
The only problem is that I need to calculate the Levenshtein distance between each observation of var1 and every observation of var2, and I am not sure how. As of now, running strdist var1 var2 gives me a pairwise calculation of the Levenshtein distance between var1 and var2 within the same row only. Does anyone know how best to implement this?
Best,
Fredrick
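
To make the all-pairs requirement concrete, here is a minimal sketch in Python rather than Stata. It assumes the values of var1 and var2 are available as two plain lists of strings. The function names levenshtein and all_pairs_distance are hypothetical, chosen for illustration, and the distance function is a textbook dynamic-programming implementation, not the code inside strdist. The idea is to form every (var1, var2) combination first and only then compute the distance for each pair. In Stata, the analogous first step would presumably be to build all pairwise combinations (the built-in cross command forms every pairwise combination of two datasets) before calling strdist on the result.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def all_pairs_distance(var1, var2):
    """Return (a, b, distance) for every a in var1 paired with every b in var2."""
    # Hypothetical helper: product() plays the role of the cross join.
    return [(a, b, levenshtein(a, b)) for a, b in product(var1, var2)]


if __name__ == "__main__":
    for row in all_pairs_distance(["kitten", "flaw"], ["sitting", "lawn"]):
        print(row)
```

Note that the result has one row per combination, so N values in var1 and M values in var2 give N x M rows. For large datasets this grows quickly, and some form of blocking or a maximum-distance cutoff may be needed.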