Does anyone know how reclink chooses which potential matches to report? Does it effectively sort the potential matches by similarity score and take the highest-scoring match first (a greedy algorithm)? Or does it use some sort of optimal algorithm, one that considers all candidate pairs together? Or something else?
Similarly, for people who use matchit, how do you choose which potential matches to use when doing a 1:1 fuzzy match of two datasets?
I'm looking more for best practices than code, though I'd be interested in code that maximizes the total similarity score if anyone has such a thing.
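To make the distinction concrete, here is a minimal Python sketch of the two strategies on a toy score matrix. It is only an illustration of greedy versus total-score-maximizing 1:1 matching, not a description of what reclink or matchit actually do internally. The function names (`greedy_match`, `optimal_match`, `total`) and the scores are made up for the example, and the brute-force search is only feasible for tiny inputs.

```python
from itertools import permutations


def greedy_match(scores):
    """Take pairs in descending score order, skipping any row or column already used."""
    candidates = sorted(
        ((s, i, j) for i, row in enumerate(scores) for j, s in enumerate(row)),
        reverse=True,
    )
    used_rows, used_cols, chosen = set(), set(), []
    for s, i, j in candidates:
        if i not in used_rows and j not in used_cols:
            chosen.append((i, j))
            used_rows.add(i)
            used_cols.add(j)
    return sorted(chosen)


def optimal_match(scores):
    """Try every 1:1 assignment of a square matrix and keep the one with the highest total."""
    n = len(scores)
    best = max(
        permutations(range(n)),
        key=lambda p: sum(scores[i][p[i]] for i in range(n)),
    )
    return [(i, best[i]) for i in range(n)]


def total(scores, pairs):
    """Sum the similarity scores of the chosen pairs."""
    return sum(scores[i][j] for i, j in pairs)


# Hypothetical similarity scores (in percent): rows are records in
# dataset A, columns are records in dataset B.
scores = [
    [90, 85],
    [80, 10],
]

greedy = greedy_match(scores)    # grabs the 90 first, leaving only the 10
optimal = optimal_match(scores)  # gives up the 90 to get 85 + 80

print(greedy, total(scores, greedy))
print(optimal, total(scores, optimal))
```

In this toy case the greedy rule locks in the single best pair and is then forced into a poor second pair, while the exhaustive search accepts a slightly worse first pair for a much higher total. For real data, brute force is not practical; an assignment solver such as the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment` with `maximize=True`) would be the usual way to get the same result.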
Thank you,
Kramer