Embracing the Sparse, Noisy, and Interrelated Aspects of Patient Demographics for use in Clinical Medical Record Linkage
Duplicate patient records in health information systems have received increased attention in recent time due to regulatory incentives to integrate the healthcare enterprise. Historically, most patient record matching systems have been limited to simple applications of the Fellegi-Sunter theory of re...
Main authors: | Ash, Stephen M., Ip-Lin, King |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Medical Informatics Association 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4525218/ https://www.ncbi.nlm.nih.gov/pubmed/26306279 |
_version_ | 1782384294026543104 |
---|---|
author | Ash, Stephen M. Ip-Lin, King |
author_sort | Ash, Stephen M. |
collection | PubMed |
description | Duplicate patient records in health information systems have received increased attention in recent time due to regulatory incentives to integrate the healthcare enterprise. Historically, most patient record matching systems have been limited to simple applications of the Fellegi-Sunter theory of record linkage with edit distance based string similarity measurements. String similarity approaches ignore the rich semantic information present by reducing it to a simple syntactic comparison of characters. This work describes an updated approach to building clinical medical record linkage systems, which embraces the unavoidable problems present in real-world patient matching. Using a ground truth dataset of a real patient population, we demonstrate that systems built in this fashion improve recall by 76% with little reduction in precision. This result empirically demonstrates the size of the gap between sophisticated systems and naïve approaches. Additionally, it accentuates the difficulty in estimating the false negative error in this setting as previous research has reported much higher levels of recall, due, in part, to measuring from biased samples. |
format | Online Article Text |
id | pubmed-4525218 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | American Medical Informatics Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-4525218 2015-08-24 AMIA Jt Summits Transl Sci Proc Articles American Medical Informatics Association 2015-03-25 /pmc/articles/PMC4525218/ /pubmed/26306279 Text en ©2015 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose |
title | Embracing the Sparse, Noisy, and Interrelated Aspects of Patient Demographics for use in Clinical Medical Record Linkage |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4525218/ https://www.ncbi.nlm.nih.gov/pubmed/26306279 |