
The Challenges of Data Quality Evaluation in a Joint Data Warehouse

INTRODUCTION: The use of clinically derived data from electronic health records (EHRs) and other electronic clinical systems can greatly facilitate clinical research as well as operational and quality initiatives. One approach for making these data available is to incorporate data from different sou...


Bibliographic Details
Main Authors: Bae, Charles J., Griffith, Sandra, Fan, Youran, Dunphy, Cheryl, Thompson, Nicolas, Urchek, John, Parchman, Alandra, Katzan, Irene L.
Format: Online Article Text
Language: English
Published: AcademyHealth 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4537084/
https://www.ncbi.nlm.nih.gov/pubmed/26290882
http://dx.doi.org/10.13063/2327-9214.1125
_version_ 1782385843605864448
author Bae, Charles J.
Griffith, Sandra
Fan, Youran
Dunphy, Cheryl
Thompson, Nicolas
Urchek, John
Parchman, Alandra
Katzan, Irene L.
author_sort Bae, Charles J.
collection PubMed
description INTRODUCTION: The use of clinically derived data from electronic health records (EHRs) and other electronic clinical systems can greatly facilitate clinical research as well as operational and quality initiatives. One approach for making these data available is to incorporate data from different sources into a joint data warehouse. When using such a data warehouse, it is important to understand the quality of the data. The primary objective of this study was to determine the completeness and concordance of common types of clinical data available in the Knowledge Program (KP) joint data warehouse, which contains feeds from several electronic systems including the EHR. METHODS: A manual review was performed of specific data elements for 250 patients from an EHR, and these were compared with corresponding elements in the KP data warehouse. Completeness and concordance were calculated for five categories of data including demographics, vital signs, laboratory results, diagnoses, and medications. RESULTS: In general, data elements for demographics, vital signs, diagnoses, and laboratory results were present in more cases in the source EHR compared to the KP. When data elements were available in both sources, there was a high concordance. In contrast, the KP data warehouse documented a higher prevalence of deaths and medications compared to the EHR. DISCUSSION: Several factors contributed to the discrepancies between data in the KP and the EHR—including the start date and frequency of data feeds updates into the KP, inability to transfer data located in nonstructured formats (e.g., free text or scanned documents), as well as incomplete and missing data variables in the source EHR. CONCLUSION: When evaluating the quality of a data warehouse with multiple data sources, assessing completeness and concordance between data set and source data may be better than designating one to be a gold standard. 
This will allow the user to optimize the method and timing of data transfer in order to capture data with better accuracy.
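The abstract does not say how completeness and concordance were computed. As a minimal sketch only, assume completeness is the fraction of reviewed patients for whom a data element is present in a given source, and concordance is the fraction of matching values among patients who have the element in both sources. The function names and the sample values below are hypothetical and are not taken from the study.

```python
from typing import Optional


def completeness(values: list[Optional[str]]) -> float:
    """Fraction of records in which the element is present (not None).

    Assumed definition; the source abstract does not give a formula.
    """
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def concordance(ehr: list[Optional[str]], kp: list[Optional[str]]) -> Optional[float]:
    """Fraction of matching values among records present in both sources.

    Returns None when no record has the element in both sources.
    Assumed definition; the source abstract does not give a formula.
    """
    pairs = [(a, b) for a, b in zip(ehr, kp) if a is not None and b is not None]
    if not pairs:
        return None
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical values of one element (for example, sex) for four patients,
# listed in the same patient order for both sources.
ehr_sex = ["F", "M", "F", None]
kp_sex = ["F", "M", None, None]

ehr_completeness = completeness(ehr_sex)
kp_completeness = completeness(kp_sex)
sex_concordance = concordance(ehr_sex, kp_sex)
```

Computing both measures per source, rather than scoring the warehouse against the EHR alone, mirrors the abstract's conclusion that neither source needs to be designated the gold standard.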
format Online
Article
Text
id pubmed-4537084
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher AcademyHealth
record_format MEDLINE/PubMed
spelling pubmed-4537084 2015-08-19 The Challenges of Data Quality Evaluation in a Joint Data Warehouse EGEMS (Wash DC) Articles AcademyHealth 2015-05-22 /pmc/articles/PMC4537084/ /pubmed/26290882 http://dx.doi.org/10.13063/2327-9214.1125 Text en All eGEMs publications are licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License http://creativecommons.org/licenses/by-nc-nd/3.0/
title The Challenges of Data Quality Evaluation in a Joint Data Warehouse
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4537084/
https://www.ncbi.nlm.nih.gov/pubmed/26290882
http://dx.doi.org/10.13063/2327-9214.1125