Hi,

I have two very large datasets (500,000 observations each), and I have been using the following command to match names between them:

matchit id allNames using "xyz.dta", idusing(familyid) txtusing(allNamesFamily)

However, matchit is taking a very long time to carry out the fuzzy match (almost 24 hours). I have decided to run the same command on smaller groups instead, but I am not sure how to write a loop for it.

Essentially, I want Stata to create district-level groups, carry out the above matchit command for each group, and save all the results together. The idea is that matchit only has to search for matches within each district-level sub-group rather than across the whole dataset.
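
To make the idea concrete, here is a rough, untested sketch of the kind of loop I have in mind. It rests on several assumptions that are not settled above: that both files contain a numeric variable called district (a hypothetical name), that the master data are saved as master.dta (also a hypothetical name), and that matchit accepts a temporary file after "using".

```stata
* Sketch only: district and master.dta are hypothetical names.
* Assumes district is numeric and exists in both datasets.

use "master.dta", clear
levelsof district, local(districts)

* Empty file that will accumulate the matches from every district
tempfile results
clear
save `results', emptyok

* Temporary file holding the using-side observations for one district
tempfile using_d

foreach d of local districts {
    * Keep only this district's observations from the using file
    use "xyz.dta" if district == `d', clear
    if _N == 0 {
        continue
    }
    save `using_d', replace

    * Keep only this district's observations from the master file
    use "master.dta" if district == `d', clear

    * Same matchit call as above, restricted to one district
    * (assumes matchit can read a tempfile as its using file)
    matchit id allNames using `using_d', idusing(familyid) txtusing(allNamesFamily)

    * matchit leaves the matched pairs in memory; tag and stack them
    gen district = `d'
    append using `results'
    save `results', replace
}

* All district-level matches combined
use `results', clear
```

If district is stored as a string, I assume the if conditions would need the value in double quotes and the gen line would have to create a string variable instead.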

Can anyone help me with writing such a loop?