
Embracing the Sparse, Noisy, and Interrelated Aspects of Patient Demographics for use in Clinical Medical Record Linkage

Duplicate patient records in health information systems have received increased attention in recent time due to regulatory incentives to integrate the healthcare enterprise. Historically, most patient record matching systems have been limited to simple applications of the Fellegi-Sunter theory of re...


Bibliographic Details
Main Authors: Ash, Stephen M., Ip-Lin, King
Format: Online Article Text
Language: English
Published: American Medical Informatics Association 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4525218/
https://www.ncbi.nlm.nih.gov/pubmed/26306279
author Ash, Stephen M.
Ip-Lin, King
collection PubMed
description Duplicate patient records in health information systems have received increased attention in recent time due to regulatory incentives to integrate the healthcare enterprise. Historically, most patient record matching systems have been limited to simple applications of the Fellegi-Sunter theory of record linkage with edit distance based string similarity measurements. String similarity approaches ignore the rich semantic information present by reducing it to a simple syntactic comparison of characters. This work describes an updated approach to building clinical medical record linkage systems, which embraces the unavoidable problems present in real-world patient matching. Using a ground truth dataset of a real patient population, we demonstrate that systems built in this fashion improve recall by 76% with little reduction in precision. This result empirically demonstrates the size of the gap between sophisticated systems and naïve approaches. Additionally, it accentuates the difficulty in estimating the false negative error in this setting as previous research has reported much higher levels of recall, due, in part, to measuring from biased samples.
format Online
Article
Text
id pubmed-4525218
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher American Medical Informatics Association
record_format MEDLINE/PubMed
spelling pubmed-45252182015-08-24 Embracing the Sparse, Noisy, and Interrelated Aspects of Patient Demographics for use in Clinical Medical Record Linkage Ash, Stephen M. Ip-Lin, King AMIA Jt Summits Transl Sci Proc Articles Duplicate patient records in health information systems have received increased attention in recent time due to regulatory incentives to integrate the healthcare enterprise. Historically, most patient record matching systems have been limited to simple applications of the Fellegi-Sunter theory of record linkage with edit distance based string similarity measurements. String similarity approaches ignore the rich semantic information present by reducing it to a simple syntactic comparison of characters. This work describes an updated approach to building clinical medical record linkage systems, which embraces the unavoidable problems present in real-world patient matching. Using a ground truth dataset of a real patient population, we demonstrate that systems built in this fashion improve recall by 76% with little reduction in precision. This result empirically demonstrates the size of the gap between sophisticated systems and naïve approaches. Additionally, it accentuates the difficulty in estimating the false negative error in this setting as previous research has reported much higher levels of recall, due, in part, to measuring from biased samples. American Medical Informatics Association 2015-03-25 /pmc/articles/PMC4525218/ /pubmed/26306279 Text en ©2015 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose
title Embracing the Sparse, Noisy, and Interrelated Aspects of Patient Demographics for use in Clinical Medical Record Linkage
topic Articles