Detecting Systemic Data Quality Issues in Electronic Health Records
Secondary analysis of electronic health records for clinical research faces significant challenges due to known data quality issues in health data observationally collected for clinical care and the data biases caused by standard healthcare processes. In this manuscript, we contribute methodology fo...
Main Authors: | Ta, Casey N.; Weng, Chunhua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857180/ https://www.ncbi.nlm.nih.gov/pubmed/31437950 http://dx.doi.org/10.3233/SHTI190248 |
_version_ | 1783470712659378176 |
---|---|
author | Ta, Casey N. Weng, Chunhua |
author_facet | Ta, Casey N. Weng, Chunhua |
author_sort | Ta, Casey N. |
collection | PubMed |
description | Secondary analysis of electronic health records for clinical research faces significant challenges due to known data quality issues in health data observationally collected for clinical care and the data biases caused by standard healthcare processes. In this manuscript, we contribute methodology for data quality assessment by plotting domain-level (conditions (diagnoses), drugs, and procedures) aggregate statistics and concept-level temporal frequencies (i.e., annual prevalence rates of clinical concepts). We detect common temporal patterns in concept frequencies by normalizing and clustering annual concept frequencies using K-means clustering. We apply these methods to the Columbia University Irving Medical Center Observational Medical Outcomes Partnership database. The resulting domain-aggregate and cluster plots show a variety of patterns. We review the patterns found in the condition domain and investigate the processes that shape them. We find that these patterns suggest data quality issues influenced by systemwide factors that affect individual concept frequencies. |
format | Online Article Text |
id | pubmed-6857180 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
record_format | MEDLINE/PubMed |
spelling | pubmed-68571802019-11-15 Detecting Systemic Data Quality Issues in Electronic Health Records Ta, Casey N. Weng, Chunhua Stud Health Technol Inform Article Secondary analysis of electronic health records for clinical research faces significant challenges due to known data quality issues in health data observationally collected for clinical care and the data biases caused by standard healthcare processes. In this manuscript, we contribute methodology for data quality assessment by plotting domain-level (conditions (diagnoses), drugs, and procedures) aggregate statistics and concept-level temporal frequencies (i.e., annual prevalence rates of clinical concepts). We detect common temporal patterns in concept frequencies by normalizing and clustering annual concept frequencies using K-means clustering. We apply these methods to the Columbia University Irving Medical Center Observational Medical Outcomes Partnership database. The resulting domain-aggregate and cluster plots show a variety of patterns. We review the patterns found in the condition domain and investigate the processes that shape them. We find that these patterns suggest data quality issues influenced by systemwide factors that affect individual concept frequencies. 2019-08-21 /pmc/articles/PMC6857180/ /pubmed/31437950 http://dx.doi.org/10.3233/SHTI190248 Text en http://creativecommons.org/licenses/by-nc/4.0/ This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0). |
spellingShingle | Article Ta, Casey N. Weng, Chunhua Detecting Systemic Data Quality Issues in Electronic Health Records |
title | Detecting Systemic Data Quality Issues in Electronic Health Records |
title_full | Detecting Systemic Data Quality Issues in Electronic Health Records |
title_fullStr | Detecting Systemic Data Quality Issues in Electronic Health Records |
title_full_unstemmed | Detecting Systemic Data Quality Issues in Electronic Health Records |
title_short | Detecting Systemic Data Quality Issues in Electronic Health Records |
title_sort | detecting systemic data quality issues in electronic health records |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857180/ https://www.ncbi.nlm.nih.gov/pubmed/31437950 http://dx.doi.org/10.3233/SHTI190248 |
work_keys_str_mv | AT tacaseyn detectingsystemicdataqualityissuesinelectronichealthrecords AT wengchunhua detectingsystemicdataqualityissuesinelectronichealthrecords |
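
The description field above mentions normalizing annual concept frequencies and grouping them with K-means clustering. The following is a minimal sketch of that idea, not the authors' implementation. Several things are assumptions made for illustration: the concept names and frequency values are invented toy data, scaling each series by its maximum is only one possible normalization (the record does not say which one was used), and the farthest-point initialization is a simplification that keeps the example deterministic.

```python
def normalize(series):
    """Scale one concept's yearly frequencies by its peak (assumed normalization)."""
    peak = max(series)
    if peak == 0:
        return [0.0 for _ in series]
    return [value / peak for value in series]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def kmeans(points, k, iterations=50):
    """Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic farthest-point initialization (a simplification).
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels


def cluster_concepts(annual_frequencies, k):
    """Map each concept name to the cluster label of its normalized series."""
    names = list(annual_frequencies)
    points = [normalize(annual_frequencies[name]) for name in names]
    return dict(zip(names, kmeans(points, k)))


# Hypothetical toy data: fraction of patients with each concept per year.
toy_frequencies = {
    "concept_a": [0.01, 0.02, 0.03, 0.04],  # rising, rare
    "concept_b": [0.10, 0.15, 0.30, 0.40],  # rising, common
    "concept_c": [0.40, 0.30, 0.20, 0.10],  # falling, common
    "concept_d": [0.08, 0.06, 0.05, 0.02],  # falling, rare
}

toy_labels = cluster_concepts(toy_frequencies, k=2)
print(toy_labels)
```

Because each series is scaled before clustering, the rare and common rising concepts in the toy data land in the same cluster: the grouping reflects the shape of the trend over time rather than how often the concept occurs.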