The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data
BACKGROUND: Data audits are often evaluated soon after completion, even though the identification of systematic issues may lead to additional data quality improvements in the future. In this study, we assess the impact of the entire data audit process on subsequent statistical analyses. METHODS: We...
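Main authors: | Giganti, Mark J., Shepherd, Bryan E., Caro-Vega, Yanink, Luz, Paula M., Rebeiro, Peter F., Maia, Marcelle, Julmiste, Gaetane, Cortes, Claudia, McGowan, Catherine C., Duda, Stephany N. |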
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6937856/ https://www.ncbi.nlm.nih.gov/pubmed/31888571 http://dx.doi.org/10.1186/s12889-019-8105-2 |
_version_ | 1783483951662235648 |
---|---|
author | Giganti, Mark J. Shepherd, Bryan E. Caro-Vega, Yanink Luz, Paula M. Rebeiro, Peter F. Maia, Marcelle Julmiste, Gaetane Cortes, Claudia McGowan, Catherine C. Duda, Stephany N. |
author_facet | Giganti, Mark J. Shepherd, Bryan E. Caro-Vega, Yanink Luz, Paula M. Rebeiro, Peter F. Maia, Marcelle Julmiste, Gaetane Cortes, Claudia McGowan, Catherine C. Duda, Stephany N. |
author_sort | Giganti, Mark J. |
collection | PubMed |
description | BACKGROUND: Data audits are often evaluated soon after completion, even though the identification of systematic issues may lead to additional data quality improvements in the future. In this study, we assess the impact of the entire data audit process on subsequent statistical analyses. METHODS: We conducted on-site audits of datasets from nine international HIV care sites. Error rates were quantified for key demographic and clinical variables among a subset of records randomly selected for auditing. Based on audit results, some sites were tasked with targeted validation of high-error-rate variables resulting in a post-audit dataset. We estimated the times from antiretroviral therapy initiation until death and first AIDS-defining event using the pre-audit data, the audit data, and the post-audit data. RESULTS: The overall discrepancy rate between pre-audit and audit data (n = 250) across all audited variables was 17.1%. The estimated probability of mortality and an AIDS-defining event over time was higher in the audited data relative to the pre-audit data. Among patients represented in both the post-audit and pre-audit cohorts (n = 18,999), AIDS and mortality estimates also were higher in the post-audit data. CONCLUSION: Though some changes may have occurred independently, our findings suggest that improved data quality following the audit may impact epidemiological inferences. |
format | Online Article Text |
id | pubmed-6937856 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-69378562019-12-31 The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data Giganti, Mark J. Shepherd, Bryan E. Caro-Vega, Yanink Luz, Paula M. Rebeiro, Peter F. Maia, Marcelle Julmiste, Gaetane Cortes, Claudia McGowan, Catherine C. Duda, Stephany N. BMC Public Health Research Article BACKGROUND: Data audits are often evaluated soon after completion, even though the identification of systematic issues may lead to additional data quality improvements in the future. In this study, we assess the impact of the entire data audit process on subsequent statistical analyses. METHODS: We conducted on-site audits of datasets from nine international HIV care sites. Error rates were quantified for key demographic and clinical variables among a subset of records randomly selected for auditing. Based on audit results, some sites were tasked with targeted validation of high-error-rate variables resulting in a post-audit dataset. We estimated the times from antiretroviral therapy initiation until death and first AIDS-defining event using the pre-audit data, the audit data, and the post-audit data. RESULTS: The overall discrepancy rate between pre-audit and audit data (n = 250) across all audited variables was 17.1%. The estimated probability of mortality and an AIDS-defining event over time was higher in the audited data relative to the pre-audit data. Among patients represented in both the post-audit and pre-audit cohorts (n = 18,999), AIDS and mortality estimates also were higher in the post-audit data. CONCLUSION: Though some changes may have occurred independently, our findings suggest that improved data quality following the audit may impact epidemiological inferences. BioMed Central 2019-12-30 /pmc/articles/PMC6937856/ /pubmed/31888571 http://dx.doi.org/10.1186/s12889-019-8105-2 Text en © The Author(s). 
2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Giganti, Mark J. Shepherd, Bryan E. Caro-Vega, Yanink Luz, Paula M. Rebeiro, Peter F. Maia, Marcelle Julmiste, Gaetane Cortes, Claudia McGowan, Catherine C. Duda, Stephany N. The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data |
title | The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data |
title_full | The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data |
title_fullStr | The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data |
title_full_unstemmed | The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data |
title_short | The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data |
title_sort | impact of data quality and source data verification on epidemiologic inference: a practical application using hiv observational data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6937856/ https://www.ncbi.nlm.nih.gov/pubmed/31888571 http://dx.doi.org/10.1186/s12889-019-8105-2 |