
Embracing the Sparse, Noisy, and Interrelated Aspects of Patient Demographics for use in Clinical Medical Record Linkage


Bibliographic Details
Main authors: Ash, Stephen M., Ip-Lin, King
Format: Online Article Text
Language: English
Published: American Medical Informatics Association 2015
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4525218/
https://www.ncbi.nlm.nih.gov/pubmed/26306279
Description
Summary: Duplicate patient records in health information systems have received increased attention in recent years due to regulatory incentives to integrate the healthcare enterprise. Historically, most patient record matching systems have been limited to simple applications of the Fellegi-Sunter theory of record linkage with edit-distance-based string similarity measurements. String similarity approaches ignore the rich semantic information present in the data, reducing each comparison to a simple syntactic comparison of characters. This work describes an updated approach to building clinical medical record linkage systems, one that embraces the unavoidable problems present in real-world patient matching. Using a ground truth dataset of a real patient population, we demonstrate that systems built in this fashion improve recall by 76% with little reduction in precision. This result empirically demonstrates the size of the gap between sophisticated systems and naïve approaches. Additionally, it accentuates the difficulty of estimating the false negative error in this setting, as previous research has reported much higher levels of recall due, in part, to measuring from biased samples.
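
To make the baseline named in the summary concrete, the following is a minimal sketch of a Fellegi-Sunter match weight computed from edit-distance string similarity. It is not the authors' system. The field names, the m and u probabilities, and the agreement threshold are all hypothetical values chosen for illustration, and every field is assumed to be present as a string.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


# Hypothetical parameters (not taken from the paper):
# m = P(field agrees | records match), u = P(field agrees | records do not match).
FIELDS = {
    "first_name": (0.95, 0.010),
    "last_name": (0.95, 0.005),
    "birth_date": (0.98, 0.003),
}
AGREE_THRESHOLD = 0.85  # assumed cutoff for treating a field as "agreeing"


def match_weight(rec_a: dict, rec_b: dict) -> float:
    """Sum of per-field log-likelihood ratios (Fellegi-Sunter match weight)."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        sim = similarity(rec_a[field].lower(), rec_b[field].lower())
        if sim >= AGREE_THRESHOLD:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


a = {"first_name": "Katherine", "last_name": "Smith", "birth_date": "1980-02-14"}
b = {"first_name": "Kate", "last_name": "Smith", "birth_date": "1980-02-14"}
print(match_weight(a, a), match_weight(a, b))
```

In this sketch "Katherine" and "Kate" fall below the agreement threshold, so the first-name field is scored as a disagreement even though the two names plausibly refer to the same person. That is one example of the semantic information a purely syntactic comparison cannot use.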