
The Impact of Nonrandom Missingness in Surveillance Data for Population-Level Summaries: Simulation Study

BACKGROUND: Surveillance data are essential public health resources for guiding policy and allocation of human and capital resources. These data often consist of large collections of information based on nonrandom sample designs. Population estimates based on such data may be impacted by the underly...


Bibliographic Details
Main Authors:	Weiss, Paul Samuel; Waller, Lance Allyn
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2022
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9508670/ https://www.ncbi.nlm.nih.gov/pubmed/36083618 http://dx.doi.org/10.2196/37887

_version_	1784797066295246848
author	Weiss, Paul Samuel; Waller, Lance Allyn
collection	PubMed
description	BACKGROUND: Surveillance data are essential public health resources for guiding policy and allocation of human and capital resources. These data often consist of large collections of information based on nonrandom sample designs. Population estimates based on such data may be impacted by the underlying sample distribution compared to the true population of interest. In this study, we simulate a population of interest and allow response rates to vary in nonrandom ways to illustrate and measure the effect this has on population-based estimates of an important public health policy outcome. OBJECTIVE: The aim of this study was to illustrate the effect of nonrandom missingness on population-based survey sample estimation. METHODS: We simulated a population of respondents answering a survey question about their satisfaction with their community’s policy regarding vaccination mandates for government personnel. We allowed response rates to differ between the generally satisfied and dissatisfied and considered the effect of common efforts to control for potential bias such as sampling weights, sample size inflation, and hypothesis tests for determining missingness at random. We compared these conditions via mean squared errors and sampling variability to characterize the bias in estimation arising under these different approaches. RESULTS: Sample estimates present clear and quantifiable bias, even in the most favorable response profile. On a 5-point Likert scale, nonrandom missingness resulted in errors averaging to almost a full point away from the truth. Efforts to mitigate bias through sample size inflation and sampling weights have negligible effects on the overall results. Additionally, hypothesis testing for departures from random missingness rarely detects the nonrandom missingness across the widest range of response profiles considered. CONCLUSIONS: Our results suggest that assuming surveillance data are missing at random during analysis could provide estimates that are widely different from what we might see in the whole population. Policy decisions based on such potentially biased estimates could be devastating in terms of community disengagement and health disparities. Alternative approaches to analysis that move away from broad generalization of a mismeasured population at risk are necessary to identify the marginalized groups, where overall response may be very different from those observed in measured respondents.
format	Online Article Text
id	pubmed-9508670
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-9508670 2022-09-25 The Impact of Nonrandom Missingness in Surveillance Data for Population-Level Summaries: Simulation Study Weiss, Paul Samuel; Waller, Lance Allyn JMIR Public Health Surveill Original Paper JMIR Publications 2022-09-09 /pmc/articles/PMC9508670/ /pubmed/36083618 http://dx.doi.org/10.2196/37887 Text en ©Paul Samuel Weiss, Lance Allyn Waller. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 09.09.2022. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.
title	The Impact of Nonrandom Missingness in Surveillance Data for Population-Level Summaries: Simulation Study
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9508670/ https://www.ncbi.nlm.nih.gov/pubmed/36083618 http://dx.doi.org/10.2196/37887
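The simulation design summarized in the METHODS portion of the description can be illustrated with a minimal sketch. This is not the authors' code: the Likert category probabilities, the score-dependent response rates, the sample sizes, and every function name below are hypothetical values chosen only for illustration. The sketch draws simple random samples from a simulated population, lets the chance of responding depend on the answer itself, and reports the bias and mean squared error of the respondent mean, with and without sample size inflation. It leaves out the sampling weights and the hypothesis tests for missingness at random that the study also considered.

```python
import random
import statistics

LIKERT = [1, 2, 3, 4, 5]  # 1 = very dissatisfied ... 5 = very satisfied


def simulate_population(n, probs, rng):
    """Draw n Likert scores using the given (hypothetical) category probabilities."""
    return rng.choices(LIKERT, weights=probs, k=n)


def respondent_mean(population, sample_size, response_rate, rng):
    """Take a simple random sample, keep each unit with a probability that
    depends on its own score, and return the mean among respondents."""
    sample = rng.sample(population, sample_size)
    observed = [y for y in sample if rng.random() < response_rate[y]]
    return statistics.fmean(observed)


def bias_and_mse(population, sample_size, response_rate, reps, rng):
    """Repeat the survey `reps` times; return (bias, mean squared error)
    of the respondent mean relative to the true population mean."""
    truth = statistics.fmean(population)
    estimates = [
        respondent_mean(population, sample_size, response_rate, rng)
        for _ in range(reps)
    ]
    bias = statistics.fmean(estimates) - truth
    mse = statistics.fmean([(e - truth) ** 2 for e in estimates])
    return bias, mse


if __name__ == "__main__":
    rng = random.Random(2022)
    # Hypothetical population that leans dissatisfied.
    population = simulate_population(50_000, [0.30, 0.20, 0.15, 0.20, 0.15], rng)
    # Hypothetical response profiles: satisfied people answer more often.
    nonrandom = {1: 0.2, 2: 0.2, 3: 0.4, 4: 0.7, 5: 0.7}
    at_random = {score: 0.4 for score in LIKERT}
    scenarios = [
        ("missing at random, n=500", at_random, 500),
        ("nonrandom, n=500", nonrandom, 500),
        ("nonrandom, n=5000", nonrandom, 5000),
    ]
    for label, rates, n in scenarios:
        bias, mse = bias_and_mse(population, n, rates, 200, rng)
        print(f"{label:<26} bias={bias:+.3f}  mse={mse:.3f}")
```

Under these assumed settings, the nonrandom scenarios should show a bias that stays roughly the same when the sample is ten times larger, because a bigger sample reduces sampling variability but not the distortion introduced by who responds. That is the pattern the RESULTS statement about sample size inflation describes.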
