
How Adaptive Fuzzy Matching Fixes Messy Single-Field Data Merges

Tags: record linkage | fuzzy matching | adaptive learning | entity resolution | data merging
Methodology @ Pol. An. | 4 R files | 26 datasets | Dataverse

🔍 The Problem: Combining data from multiple sources is routine, but matching records becomes difficult when datasets lack a shared unique identifier. Fields like names, addresses, and phone numbers are often mistyped, missing, or formatted inconsistently. Progress has been made for the case where many matching fields exist, but the far more uncertain case of only one identifying field (the fuzzy string matching problem) remains largely unsolved.
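
To make the single-field problem concrete, here is a minimal sketch of a conventional fixed-threshold fuzzy match of the kind the paragraph above describes as insufficient. It is not the Adaptive Fuzzy String Matching algorithm: it relies on Python's standard-library `difflib.SequenceMatcher`, and the function names, the example strings, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity ratio between two strings,
    compared after trimming whitespace and lowercasing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_matches(left, right, threshold=0.8):
    """For each string in `left`, return the most similar string in `right`,
    or None when the best score falls below `threshold` (an assumed cutoff)."""
    result = {}
    for name in left:
        scored = [(similarity(name, cand), cand) for cand in right]
        score, cand = max(scored) if scored else (0.0, None)
        result[name] = cand if score >= threshold else None
    return result


# Hypothetical messy identifiers from two data sets with no shared key.
left = ["Intl. Business Machines", "Acme Corp", "Zenith Holdings"]
right = ["International Business Machines", "ACME Corp.", "Globex LLC"]

matches = best_matches(left, right, threshold=0.8)
for name, match in matches.items():
    print(f"{name!r} -> {match!r}")
```

The weakness of this baseline is the single hand-picked threshold: abbreviations can score close to the cutoff, so a small change in `threshold` flips borderline pairs between matched and unmatched.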

🧠 The Approach (An Adaptive Algorithm for Single-Field Matches): The authors design and validate an algorithmic solution called Adaptive Fuzzy String Matching. The approach is rooted in adaptive learning and tailored to the challenges of matching on a single, messy identifying field rather than relying on multiple corroborating fields.

📊 Key Findings:

  • The Adaptive Fuzzy String Matching tool identifies more true matches than existing solutions.
  • It achieves higher precision than standard fuzzy-matching approaches.
  • Validation demonstrates robustness across different kinds of entities where only one identifier is available.
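
The findings above are stated in terms of true matches and precision. As a rough sketch of how such quantities are conventionally computed, the snippet below scores a set of proposed matches against hand-labeled true matches. Reading "more true matches" as recall is an assumption here, as are the function name and the example pairs; the summary does not describe the authors' exact evaluation procedure.

```python
def precision_recall(predicted, truth):
    """Compute (precision, recall) for predicted match pairs.

    precision = correct predicted pairs / all predicted pairs
    recall    = correct predicted pairs / all true pairs
    Both default to 0.0 when their denominator is empty.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical hand-labeled true matches between two data sets.
truth = {
    ("Acme Corp", "ACME Corp."),
    ("Intl. Business Machines", "International Business Machines"),
}

# Hypothetical output of some matching tool: one correct pair, one wrong pair.
predicted = {
    ("Acme Corp", "ACME Corp."),
    ("Zenith Holdings", "Globex LLC"),
}

precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A tool that improves both numbers at once finds more of the real matches while proposing fewer false ones, which is the combination the findings report.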

📚 Applications Tested:

  • Matching organizations
  • Matching geographic places
  • Matching individuals

These applications illustrate both the method's validity and its practical value when unique identifiers are absent.

Why It Matters: This method addresses a common and consequential gap in record linkage: reliably merging datasets when only one noisy identifier is available. Improving match rates and precision in this setting makes multi-source empirical analysis more feasible and trustworthy across many political science and policy research contexts.

Adaptive Fuzzy String Matching: How to Merge Data Sets With Only One (Messy) Identifying Field was authored by Aaron Kaufman and Aja Klevs. It was published by Cambridge University Press in Political Analysis (Pol. An.) in 2022.