Data Quality: A Systematic Review of the Biosurveillance Literature
OBJECTIVE: To highlight how data quality has been discussed in the biosurveillance literature in order to identify current gaps in knowledge and areas for future research. INTRODUCTION: Data quality monitoring is necessary for accurate disease surveillance. However, it can be challenging, especially...
Main Authors: | Reynolds, Tera; Painter, Ian; Streichert, Laura |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | University of Illinois at Chicago Library, 2013 |
Subjects: | ISDS 2012 Conference Abstracts |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692854/ |
_version_ | 1782274671450783744 |
---|---|
author | Reynolds, Tera; Painter, Ian; Streichert, Laura |
author_facet | Reynolds, Tera; Painter, Ian; Streichert, Laura |
author_sort | Reynolds, Tera |
collection | PubMed |
description | OBJECTIVE: To highlight how data quality has been discussed in the biosurveillance literature in order to identify current gaps in knowledge and areas for future research. INTRODUCTION: Data quality monitoring is necessary for accurate disease surveillance. However, it can be challenging, especially when “real-time” data are required. Data quality has been broadly defined as the degree to which data are suitable for use by data consumers [1]. When compromised at any point in a health information system, data of low quality can impair the detection of data anomalies, delay the response to emerging health threats [2], and result in inefficient use of staff and financial resources. While the impacts of poor data quality on biosurveillance are largely unknown, and vary depending on field and business processes, the information management literature includes estimates for increased costs amounting to 8–12% of organizational revenue and, in general, poorer decisions that take longer to make [3]. METHODS: - How has data quality been defined and/or discussed? - What measurements of data quality have been utilized? - What methods for monitoring data quality have been utilized? - What methods have been used to mitigate data quality issues? - What steps have been taken to improve data quality? The search included PubMed, ISDS and AMIA Conference Proceedings, and reference lists. PubMed was searched using the terms “data quality,” “biosurveillance,” “information visualization,” “quality control,” “health data,” and “missing data.” The titles and abstracts of all search results were assessed for relevance, and relevant articles were reviewed using a structured matrix. RESULTS: The completeness of data capture is the most commonly measured dimension of data quality discussed in the literature (other dimensions include timeliness and accuracy). The methods for detecting data quality issues fall into two broad categories: (1) methods for regular monitoring to identify data quality issues and (2) methods that are utilized for ad hoc assessments of data quality. Methods for regular monitoring of data quality are more likely to be automated and focused on visualization, compared with the methods described as part of special evaluations or studies, which tend to include more manual validation. Improving data quality involves the identification and correction of data errors that already exist in the system using either manual or automated data cleansing techniques [4]. Several methods of improving data quality were discussed in the public health surveillance literature, including development of an address verification algorithm that identifies an alternative, valid address [5], and manual correction of the contents of databases [6]. Communication with the data entry personnel or data providers, either on a regular basis (e.g., annual report) or when systematic data entry errors are identified, was mentioned in the literature as the most common step to prevent data quality issues. CONCLUSIONS: In reviewing the biosurveillance literature in the context of the data quality field, the largest gap appears to be that the data quality methods discussed in the literature are often ad hoc and not consistently implemented. Developing a data quality program to identify the causes of lower quality health data, address data quality problems, and prevent issues would allow public health departments to more efficiently and effectively conduct biosurveillance and to apply results to improving public health practice. |
format | Online Article Text |
id | pubmed-3692854 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | University of Illinois at Chicago Library |
record_format | MEDLINE/PubMed |
spelling | pubmed-3692854 2013-06-26 Data Quality: A Systematic Review of the Biosurveillance Literature Reynolds, Tera Painter, Ian Streichert, Laura Online J Public Health Inform ISDS 2012 Conference Abstracts
University of Illinois at Chicago Library 2013-04-04 /pmc/articles/PMC3692854/ Text en ©2013 the author(s) http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/ojphi/about/submissions#copyrightNotice This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes. |
spellingShingle | ISDS 2012 Conference Abstracts Reynolds, Tera Painter, Ian Streichert, Laura Data Quality: A Systematic Review of the Biosurveillance Literature |
title | Data Quality: A Systematic Review of the Biosurveillance Literature |
title_full | Data Quality: A Systematic Review of the Biosurveillance Literature |
title_fullStr | Data Quality: A Systematic Review of the Biosurveillance Literature |
title_full_unstemmed | Data Quality: A Systematic Review of the Biosurveillance Literature |
title_short | Data Quality: A Systematic Review of the Biosurveillance Literature |
title_sort | data quality: a systematic review of the biosurveillance literature |
topic | ISDS 2012 Conference Abstracts |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692854/ |
work_keys_str_mv | AT reynoldstera dataqualityasystematicreviewofthebiosurveillanceliterature AT painterian dataqualityasystematicreviewofthebiosurveillanceliterature AT streichertlaura dataqualityasystematicreviewofthebiosurveillanceliterature |