Item Details
Journal Article
ID: 165416
Title Proper: Using a probabilistic model to assist merging of large-scale administrative records
Language: ENG
Author: ENAMORADO, TED
Summary / Abstract (Note): Since most social science research relies on multiple data sources, merging data sets is an essential part of researchers’ workflow. Unfortunately, a unique identifier that unambiguously links records is often unavailable, and data may contain missing and inaccurate information. These problems are severe especially when merging large-scale administrative records. We develop a fast and scalable algorithm to implement a canonical model of probabilistic record linkage that has many advantages over deterministic methods frequently used by social scientists. The proposed methodology efficiently handles millions of observations while accounting for missing data and measurement error, incorporating auxiliary information, and adjusting for uncertainty about merging in post-merge analyses. We conduct comprehensive simulation studies to evaluate the performance of our algorithm in realistic scenarios. We also apply our methodology to merging campaign contribution records, survey data, and nationwide voter files. An open-source software package is available for implementing the proposed methodology.
'In' Analytical Note: American Political Science Review Vol. 113, No. 2; May 2019: p. 353-371
Journal Source: American Political Science Review 2019-06 113, 2
Key Words: Probabilistic model; Large-Scale Administrative Records