The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data

BACKGROUND: Data audits are often evaluated soon after completion, even though the identification of systematic issues may lead to additional data quality improvements in the future. In this study, we assess the impact of the entire data audit process on subsequent statistical analyses. METHODS: We...

Bibliographic Details
Main authors: Giganti, Mark J., Shepherd, Bryan E., Caro-Vega, Yanink, Luz, Paula M., Rebeiro, Peter F., Maia, Marcelle, Julmiste, Gaetane, Cortes, Claudia, McGowan, Catherine C., Duda, Stephany N.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6937856/
https://www.ncbi.nlm.nih.gov/pubmed/31888571
http://dx.doi.org/10.1186/s12889-019-8105-2
author Giganti, Mark J.
Shepherd, Bryan E.
Caro-Vega, Yanink
Luz, Paula M.
Rebeiro, Peter F.
Maia, Marcelle
Julmiste, Gaetane
Cortes, Claudia
McGowan, Catherine C.
Duda, Stephany N.
collection PubMed
description BACKGROUND: Data audits are often evaluated soon after completion, even though the identification of systematic issues may lead to additional data quality improvements in the future. In this study, we assess the impact of the entire data audit process on subsequent statistical analyses. METHODS: We conducted on-site audits of datasets from nine international HIV care sites. Error rates were quantified for key demographic and clinical variables among a subset of records randomly selected for auditing. Based on audit results, some sites were tasked with targeted validation of high-error-rate variables resulting in a post-audit dataset. We estimated the times from antiretroviral therapy initiation until death and first AIDS-defining event using the pre-audit data, the audit data, and the post-audit data. RESULTS: The overall discrepancy rate between pre-audit and audit data (n = 250) across all audited variables was 17.1%. The estimated probability of mortality and an AIDS-defining event over time was higher in the audited data relative to the pre-audit data. Among patients represented in both the post-audit and pre-audit cohorts (n = 18,999), AIDS and mortality estimates also were higher in the post-audit data. CONCLUSION: Though some changes may have occurred independently, our findings suggest that improved data quality following the audit may impact epidemiological inferences.
format Online
Article
Text
id pubmed-6937856
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6937856 2019-12-31 BMC Public Health Research Article BioMed Central 2019-12-30 /pmc/articles/PMC6937856/ /pubmed/31888571 http://dx.doi.org/10.1186/s12889-019-8105-2 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6937856/
https://www.ncbi.nlm.nih.gov/pubmed/31888571
http://dx.doi.org/10.1186/s12889-019-8105-2