
Unweighted regression models perform better than weighted regression techniques for respondent-driven sampling data: results from a simulation study

BACKGROUND: It is unclear whether weighted or unweighted regression is preferred in the analysis of data derived from respondent driven sampling. Our objective was to evaluate the validity of various regression models, with and without weights and with various controls for clustering in the estimati...


Bibliographic Details
Main authors:	Avery, Lisa; Rotondi, Nooshin; McKnight, Constance; Firestone, Michelle; Smylie, Janet; Rotondi, Michael
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819607/ https://www.ncbi.nlm.nih.gov/pubmed/31664912 http://dx.doi.org/10.1186/s12874-019-0842-5

collection	PubMed
description	BACKGROUND: It is unclear whether weighted or unweighted regression is preferred in the analysis of data derived from respondent driven sampling. Our objective was to evaluate the validity of various regression models, with and without weights and with various controls for clustering in the estimation of the risk of group membership from data collected using respondent-driven sampling (RDS).
	METHODS: Twelve networked populations, with varying levels of homophily and prevalence, based on a known distribution of a continuous predictor were simulated using 1000 RDS samples from each population. Weighted and unweighted binomial and Poisson general linear models, with and without various clustering controls and standard error adjustments were modelled for each sample and evaluated with respect to validity, bias and coverage rate. Population prevalence was also estimated.
	RESULTS: In the regression analysis, the unweighted log-link (Poisson) models maintained the nominal type-I error rate across all populations. Bias was substantial and type-I error rates unacceptably high for weighted binomial regression. Coverage rates for the estimation of prevalence were highest using RDS-weighted logistic regression, except at low prevalence (10%) where unweighted models are recommended.
	CONCLUSIONS: Caution is warranted when undertaking regression analysis of RDS data. Even when reported degree is accurate, low reported degree can unduly influence regression estimates. Unweighted Poisson regression is therefore recommended.
format	Online Article Text
id	pubmed-6819607
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6819607 2019-10-31 BMC Med Res Methodol Research Article BioMed Central 2019-10-29 /pmc/articles/PMC6819607/ /pubmed/31664912 http://dx.doi.org/10.1186/s12874-019-0842-5 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
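
The abstract's caution that a low reported degree can unduly influence estimates can be illustrated with a small sketch. It assumes inverse-degree (RDS-II style) weights, which the abstract itself does not specify, and uses an invented five-person sample; the function names and data are hypothetical and are not taken from the article.

```python
# Illustrative sketch only: assumes inverse-degree (RDS-II style) weights.
# The sample below is made up for demonstration, not data from the study.

def inverse_degree_weights(degrees):
    """Return weights proportional to 1/degree, normalized to sum to 1."""
    inverse = [1.0 / d for d in degrees]
    total = sum(inverse)
    return [value / total for value in inverse]


def weighted_prevalence(outcomes, degrees):
    """Weighted mean of a binary outcome using inverse-degree weights."""
    weights = inverse_degree_weights(degrees)
    return sum(w * y for w, y in zip(weights, outcomes))


# Hypothetical sample: one respondent reports a single contact,
# the other four report ten contacts each.
degrees = [1, 10, 10, 10, 10]
outcomes = [1, 0, 0, 0, 0]  # only the low-degree respondent is a group member

weights = inverse_degree_weights(degrees)
unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_prevalence(outcomes, degrees)

print("weight of the low-degree respondent:", round(weights[0], 3))
print("unweighted prevalence:", round(unweighted, 3))
print("inverse-degree weighted prevalence:", round(weighted, 3))
```

In this toy sample the single low-degree respondent carries most of the total weight, so the weighted estimate is driven almost entirely by that one observation. The same mechanism would apply to the score contributions in a weighted regression.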
