BJ Data Tech Solution

BJ Data Tech Solutions specializes in data processing, data management implementation plans, data collection tools (electronic and paper-based), data cleaning specifications, data extraction, transformation and loading, analytical datasets, and data analysis. It teaches the design and development of electronic data collection tools using CSPro, and Stata commands for data manipulation. It also sets up data management systems using modern technologies such as relational databases, C#, PHP and Android.

Identifying pairs of strings that differ by only two consecutive, inverted characters

I want to identify when two putatively matched surnames (stored as two string variables) differ only by the switching of two consecutive letters. For example, to flag someone who has "CARSLAKE" in String1 and "CARLSAKE" in String2. I don't want to accept other pairs with a Levenshtein distance of 2 as they look more like distinct names (not typos).

I can imagine something looping through each letter in turn using substr(), but this would be very long-winded and clunky since the surnames of course vary in length between pairs (within pairs, I'm only interested if they're the same length). Does anyone know of a more sensible solution? Thanks.
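One possible approach avoids looping over every letter by hand: compare the two strings position by position, collect the positions where they disagree, and flag the pair only if there are exactly two such positions, they are adjacent, and the characters at them are exchanged. The sketch below is in Python for illustration only, not Stata, and the function name `is_adjacent_swap` is a made-up helper. The same test could presumably be ported to a loop over `substr()` running up to the longest surname in the data.

```python
def is_adjacent_swap(a: str, b: str) -> bool:
    """Return True when b is a with exactly one pair of neighbouring
    characters exchanged (e.g. "CARSLAKE" vs "CARLSAKE")."""
    # Pairs of unequal length can never be a pure transposition.
    if len(a) != len(b):
        return False
    # Positions at which the two strings disagree.
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    # A single adjacent swap produces exactly two mismatches.
    if len(diff) != 2:
        return False
    i, j = diff
    # The mismatches must be next to each other and mirror one another.
    return j == i + 1 and a[i] == b[j] and a[j] == b[i]


# Hypothetical example pairs standing in for String1 and String2.
pairs = [
    ("CARSLAKE", "CARLSAKE"),  # adjacent letters swapped
    ("SMITH", "SMYTH"),        # one substitution
    ("ABCD", "CBAD"),          # two mismatches, but not adjacent
    ("JONES", "JONES"),        # identical
]
flags = [is_adjacent_swap(s1, s2) for s1, s2 in pairs]
print(flags)
```

Because the check requires the two mismatched positions to mirror each other, other pairs with a Levenshtein distance of 2 (two unrelated substitutions, for instance) are not flagged.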
