
Causal Modeling to Mitigate Selection Bias and Unmeasured Confounding in Internet-Based Epidemiology of COVID-19: Model Development and Validation


Bibliographic Details
Main Authors:	Stockham, Nathaniel; Washington, Peter; Chrisman, Brianna; Paskov, Kelley; Jung, Jae-Yoon; Wall, Dennis Paul
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2022
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9307267/ https://www.ncbi.nlm.nih.gov/pubmed/35605128 http://dx.doi.org/10.2196/31306

author	Stockham, Nathaniel; Washington, Peter; Chrisman, Brianna; Paskov, Kelley; Jung, Jae-Yoon; Wall, Dennis Paul
author_sort	Stockham, Nathaniel
collection	PubMed
description	BACKGROUND: Selection bias and unmeasured confounding are fundamental problems in epidemiology that threaten study internal and external validity. These phenomena are particularly dangerous in internet-based public health surveillance, where traditional mitigation and adjustment methods are inapplicable, unavailable, or out of date. Recent theoretical advances in causal modeling can mitigate these threats, but these innovations have not been widely deployed in the epidemiological community. OBJECTIVE: The purpose of our paper is to demonstrate the practical utility of causal modeling to both detect unmeasured confounding and selection bias and guide model selection to minimize bias. We implemented this approach in an applied epidemiological study of the COVID-19 cumulative infection rate in the New York City (NYC) spring 2020 epidemic. METHODS: We collected primary data from Qualtrics surveys of Amazon Mechanical Turk (MTurk) crowd workers residing in New Jersey and New York State across 2 sampling periods: April 11-14 and May 8-11, 2020. The surveys queried the subjects on household health status and demographic characteristics. We constructed a set of possible causal models of household infection and survey selection mechanisms and ranked them by compatibility with the collected survey data. The most compatible causal model was then used to estimate the cumulative infection rate in each survey period. RESULTS: There were 527 and 513 responses collected for the 2 periods, respectively. Response demographics were highly skewed toward a younger age in both survey periods. Despite the extremely strong relationship between age and COVID-19 symptoms, we recovered minimally biased estimates of the cumulative infection rate using only primary data and the most compatible causal model, with a relative bias of +3.8% and –1.9% from the reported cumulative infection rate for the first and second survey periods, respectively. 
CONCLUSIONS: We successfully recovered accurate estimates of the cumulative infection rate from an internet-based crowdsourced sample despite considerable selection bias and unmeasured confounding in the primary data. This implementation demonstrates how simple applications of structural causal modeling can be effectively used to determine falsifiable model conditions, detect selection bias and confounding factors, and minimize estimate bias through model selection in a novel epidemiological context. As the disease and social dynamics of COVID-19 continue to evolve, public health surveillance protocols must continue to adapt; the emergence of Omicron variants and the shift to at-home testing are recent challenges. Rigorous and transparent methods to develop, deploy, and diagnose adapted surveillance protocols will be critical to their success.
format	Online Article Text
id	pubmed-9307267
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-9307267 2022-07-23 Causal Modeling to Mitigate Selection Bias and Unmeasured Confounding in Internet-Based Epidemiology of COVID-19: Model Development and Validation Stockham, Nathaniel; Washington, Peter; Chrisman, Brianna; Paskov, Kelley; Jung, Jae-Yoon; Wall, Dennis Paul JMIR Public Health Surveill Original Paper JMIR Publications 2022-07-21 /pmc/articles/PMC9307267/ /pubmed/35605128 http://dx.doi.org/10.2196/31306 Text en ©Nathaniel Stockham, Peter Washington, Brianna Chrisman, Kelley Paskov, Jae-Yoon Jung, Dennis Paul Wall. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 21.07.2022. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.
title	Causal Modeling to Mitigate Selection Bias and Unmeasured Confounding in Internet-Based Epidemiology of COVID-19: Model Development and Validation
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9307267/ https://www.ncbi.nlm.nih.gov/pubmed/35605128 http://dx.doi.org/10.2196/31306


Similar Items