
Detecting Systemic Data Quality Issues in Electronic Health Records

Secondary analysis of electronic health records for clinical research faces significant challenges due to known data quality issues in health data observationally collected for clinical care and the data biases caused by standard healthcare processes. In this manuscript, we contribute methodology fo...


Bibliographic Details
Main Authors: Ta, Casey N.; Weng, Chunhua
Format: Online Article Text
Language: English
Published: 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857180/
https://www.ncbi.nlm.nih.gov/pubmed/31437950
http://dx.doi.org/10.3233/SHTI190248
_version_ 1783470712659378176
author Ta, Casey N.
Weng, Chunhua
author_facet Ta, Casey N.
Weng, Chunhua
author_sort Ta, Casey N.
collection PubMed
description Secondary analysis of electronic health records for clinical research faces significant challenges due to known data quality issues in health data observationally collected for clinical care and the data biases caused by standard healthcare processes. In this manuscript, we contribute methodology for data quality assessment by plotting domain-level (conditions (diagnoses), drugs, and procedures) aggregate statistics and concept-level temporal frequencies (i.e., annual prevalence rates of clinical concepts). We detect common temporal patterns in concept frequencies by normalizing and clustering annual concept frequencies using K-means clustering. We apply these methods to the Columbia University Irving Medical Center Observational Medical Outcomes Partnership database. The resulting domain-aggregate and cluster plots show a variety of patterns. We review the patterns found in the condition domain and investigate the processes that shape them. We find that these patterns suggest data quality issues influenced by systemwide factors that affect individual concept frequencies.
format Online
Article
Text
id pubmed-6857180
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-6857180 2019-11-15 Detecting Systemic Data Quality Issues in Electronic Health Records Ta, Casey N. Weng, Chunhua Stud Health Technol Inform Article Secondary analysis of electronic health records for clinical research faces significant challenges due to known data quality issues in health data observationally collected for clinical care and the data biases caused by standard healthcare processes. In this manuscript, we contribute methodology for data quality assessment by plotting domain-level (conditions (diagnoses), drugs, and procedures) aggregate statistics and concept-level temporal frequencies (i.e., annual prevalence rates of clinical concepts). We detect common temporal patterns in concept frequencies by normalizing and clustering annual concept frequencies using K-means clustering. We apply these methods to the Columbia University Irving Medical Center Observational Medical Outcomes Partnership database. The resulting domain-aggregate and cluster plots show a variety of patterns. We review the patterns found in the condition domain and investigate the processes that shape them. We find that these patterns suggest data quality issues influenced by systemwide factors that affect individual concept frequencies. 2019-08-21 /pmc/articles/PMC6857180/ /pubmed/31437950 http://dx.doi.org/10.3233/SHTI190248 Text en http://creativecommons.org/licenses/by-nc/4.0/ This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
spellingShingle Article
Ta, Casey N.
Weng, Chunhua
Detecting Systemic Data Quality Issues in Electronic Health Records
title Detecting Systemic Data Quality Issues in Electronic Health Records
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857180/
https://www.ncbi.nlm.nih.gov/pubmed/31437950
http://dx.doi.org/10.3233/SHTI190248
work_keys_str_mv AT tacaseyn detectingsystemicdataqualityissuesinelectronichealthrecords
AT wengchunhua detectingsystemicdataqualityissuesinelectronichealthrecords
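
The description in this record mentions normalizing annual concept frequencies and clustering them with K-means. As a rough illustration only, and not the authors' implementation, the sketch below shows one way that step could look in plain Python. The min-max normalization, the toy frequency values, the concept names, and the helper functions `normalize` and `kmeans` are all assumptions made for this example; the record does not specify them.

```python
import random


def normalize(series):
    """Min-max scale one concept's annual frequencies to [0, 1].

    Assumed normalization; the record does not say which one was used.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's-algorithm K-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical annual frequencies (fraction of patients per year), invented
# for illustration: two rising and two falling concepts at different scales.
annual_frequencies = {
    "concept_a": [0.010, 0.012, 0.014, 0.016, 0.018],
    "concept_b": [0.100, 0.120, 0.140, 0.160, 0.180],
    "concept_c": [0.050, 0.040, 0.030, 0.020, 0.010],
    "concept_d": [0.500, 0.400, 0.300, 0.200, 0.100],
}

names = list(annual_frequencies)
points = [normalize(annual_frequencies[name]) for name in names]
labels, centroids = kmeans(points, k=2)

for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

Normalizing each series before clustering lets concepts with similar temporal shapes group together even when their absolute frequencies differ by an order of magnitude, which is why the two rising toy concepts end up in the same cluster.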