BJ Data Tech Solution

Specializing in data processing, data management implementation plans, electronic and paper-based data collection tools, data cleaning specifications, data extraction, transformation, and loading, analytical datasets, and data analysis. BJ Data Tech Solutions teaches the design and development of electronic data collection tools using CSPro, and Stata commands for data manipulation. It also sets up data management systems using technologies such as relational databases, C#, PHP, and Android.

How to create a loop function to implement fuzzy matching on a large dataset?

Hi,

I have two very large datasets (500,000 observations each), and I have been matching names between them with the following command:

matchit id allNames using "xyz.dta", idusing(familyid) txtusing(allNamesFamily)

However, matchit is taking a very long time to carry out the fuzzy match (almost 24 hours). I have decided to run the same command on smaller groups instead, but I am not sure how to write a loop for it.

Essentially, I want Stata to create district-level groups, run the matchit command above for each group, and save all the results together. The idea is that matchit only has to search for matches within each district-level subgroup rather than across the whole dataset.

Can anyone help me with writing such a loop?
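
One way the district-by-district loop described above might be sketched in Stata is shown here. It is an untested outline under several assumptions that are not in the post: the master file is called "master.dta", both files contain a numeric variable named district holding integer codes, and the combined output is saved as "matches_by_district.dta". All three names are hypothetical. It also assumes that matchit leaves the matched pairs in memory after it runs, which is worth confirming against the matchit help file.

```stata
* Sketch only. Assumed (hypothetical) names: "master.dta", the numeric
* variable district in both files, and "matches_by_district.dta".

* Collect the district codes that occur in the master file
use "master.dta", clear
levelsof district, local(districts)

* Start an empty file that accumulates the matches from every district
clear
tempfile allmatches using_d
save "`allmatches'", emptyok

foreach d of local districts {
    * Keep only this district's rows from the using file
    use if district == `d' using "xyz.dta", clear
    if _N == 0 {
        continue    // nothing to match against in this district
    }
    save "`using_d'", replace

    * Keep only this district's rows from the master file, then run the
    * same matchit call as in the post, restricted to the district
    use if district == `d' using "master.dta", clear
    matchit id allNames using "`using_d'", idusing(familyid) txtusing(allNamesFamily)

    * Tag the matches with their district and add them to the running file
    gen district = `d'
    append using "`allmatches'"
    save "`allmatches'", replace
}

* Save the combined matches from all districts
use "`allmatches'", clear
save "matches_by_district.dta", replace
```

If district is stored as a string, the two `if` conditions would need the value in double quotes, for example `if district == "`d'"`, and the `gen` line would need the same. Restricting the search this way means that names recorded under different districts in the two files are never compared, so any records with a wrong or missing district will go unmatched.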
