
Estimation and correction of bias in network simulations based on respondent-driven sampling data

Respondent-driven sampling (RDS) is widely used for collecting data on hard-to-reach populations, including information about the structure of the networks connecting the individuals. Characterizing network features can be important for designing and evaluating health programs, particularly those th...


Bibliographic Details
Main Authors:	Zhu, Lin; Menzies, Nicolas A.; Wang, Jianing; Linas, Benjamin P.; Goodreau, Steven M.; Salomon, Joshua A.
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7156755/ https://www.ncbi.nlm.nih.gov/pubmed/32286412 http://dx.doi.org/10.1038/s41598-020-63269-0

author	Zhu, Lin; Menzies, Nicolas A.; Wang, Jianing; Linas, Benjamin P.; Goodreau, Steven M.; Salomon, Joshua A.
collection	PubMed
description	Respondent-driven sampling (RDS) is widely used for collecting data on hard-to-reach populations, including information about the structure of the networks connecting the individuals. Characterizing network features can be important for designing and evaluating health programs, particularly those that involve infectious disease transmission. While the validity of population proportions estimated from RDS-based datasets has been well studied, little is known about potential biases in inference about network structure from RDS. We developed a mathematical and statistical platform to simulate network structures with exponential random graph models, and to mimic the data generation mechanisms produced by RDS. We used this framework to characterize biases in three important network statistics – density/mean degree, homophily, and transitivity. Generalized linear models were used to predict the network statistics of the original network from the network statistics of the sample network and observable sample design features. We found that RDS may introduce significant biases in the estimation of density/mean degree and transitivity, and may exaggerate homophily when preferential recruitment occurs. Adjustments to network-generating statistics derived from the prediction models could substantially improve validity of simulated networks in terms of density, and could reduce bias in replicating mean degree, homophily, and transitivity from the original network.
format	Online Article Text
id	pubmed-7156755
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-7156755 2020-04-22 Sci Rep Article Nature Publishing Group UK 2020-04-14 /pmc/articles/PMC7156755/ /pubmed/32286412 http://dx.doi.org/10.1038/s41598-020-63269-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title	Estimation and correction of bias in network simulations based on respondent-driven sampling data
topic	Article
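
The workflow summarized in the description field (simulate a network, mimic RDS recruitment, then compare density/mean degree, homophily, and transitivity between the original and sampled networks) can be illustrated with a minimal sketch. It is not the authors' platform: a simple two-group random graph stands in for the exponential random graph models, the prediction step with generalized linear models is omitted, and the sample network is assumed to be the subgraph induced on recruited nodes. All function names and parameter values (`make_network`, `rds_sample`, `induced`, `network_stats`, 200 nodes, 5 seeds, 3 coupons) are hypothetical.

```python
import random
from itertools import combinations


def make_network(n, p_same, p_diff, rng):
    """Two-group random graph; ties are likelier within a group.

    Assumption: a stand-in for an ERGM-simulated network.
    """
    group = {v: v % 2 for v in range(n)}
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        p = p_same if group[u] == group[v] else p_diff
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj, group


def rds_sample(adj, n_seeds, coupons, target, rng):
    """Recruit in waves: each respondent recruits up to `coupons`
    not-yet-sampled contacts, until `target` nodes are sampled."""
    seeds = rng.sample(sorted(adj), n_seeds)
    sampled = set(seeds)
    queue = list(seeds)
    while queue and len(sampled) < target:
        recruiter = queue.pop(0)
        candidates = [v for v in sorted(adj[recruiter]) if v not in sampled]
        rng.shuffle(candidates)
        for v in candidates[:coupons]:
            if len(sampled) >= target:
                break
            sampled.add(v)
            queue.append(v)
    return sampled


def induced(adj, nodes):
    """Subgraph induced on `nodes` (assumed definition of the sample network)."""
    return {v: adj[v] & nodes for v in nodes}


def network_stats(adj, group):
    """Density, mean degree, share of same-group ties, and transitivity."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) / 2
    same = sum(1 for u in adj for v in adj[u]
               if u < v and group[u] == group[v])
    # Transitivity = closed two-paths / all two-paths.
    two_paths = sum(len(nb) * (len(nb) - 1) / 2 for nb in adj.values())
    closed = sum(1 for v in adj
                 for a, b in combinations(sorted(adj[v]), 2) if b in adj[a])
    return {
        "density": m / (n * (n - 1) / 2) if n > 1 else 0.0,
        "mean_degree": 2 * m / n if n else 0.0,
        "homophily": same / m if m else 0.0,
        "transitivity": closed / two_paths if two_paths else 0.0,
    }


rng = random.Random(0)
adj, group = make_network(200, p_same=0.08, p_diff=0.02, rng=rng)
sample = rds_sample(adj, n_seeds=5, coupons=3, target=60, rng=rng)
full_stats = network_stats(adj, group)
sample_stats = network_stats(induced(adj, sample), group)
for name in full_stats:
    print(name, round(full_stats[name], 3), round(sample_stats[name], 3))
```

Comparing the two columns over many simulated networks and sampling designs would give the kind of paired data from which a prediction model for the original network's statistics could be fitted.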
