Cargando…

A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design

BACKGROUND: In biomedical research, response variables are often encountered which have bounded support on the open unit interval - (0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies...

Descripción completa

Detalles Bibliográficos
Autores principales: Meaney, Christopher, Moineddin, Rahim
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2014
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3999882/
https://www.ncbi.nlm.nih.gov/pubmed/24461057
http://dx.doi.org/10.1186/1471-2288-14-14
_version_ 1782313554469191680
author Meaney, Christopher
Moineddin, Rahim
author_facet Meaney, Christopher
Moineddin, Rahim
author_sort Meaney, Christopher
collection PubMed
description BACKGROUND: In biomedical research, response variables are often encountered which have bounded support on the open unit interval - (0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. METHODS: In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. RESULTS: If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N(0) = N(1) = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the response data are generated from a discrete multinomial distribution with support on (0,1). CONCLUSIONS: The linear regression model, the variable-dispersion beta regression model and the fractional logit regression model all perform well across the simulation experiments under consideration. When employing beta regression to estimate covariate effects on (0,1) response data, researchers should ensure their dispersion sub-model is properly specified, else inferential errors could arise.
format Online
Article
Text
id pubmed-3999882
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-39998822014-05-08 A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design Meaney, Christopher Moineddin, Rahim BMC Med Res Methodol Research Article BACKGROUND: In biomedical research, response variables are often encountered which have bounded support on the open unit interval - (0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. METHODS: In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. RESULTS: If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N(0) = N(1) = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the response data are generated from a discrete multinomial distribution with support on (0,1). CONCLUSIONS: The linear regression model, the variable-dispersion beta regression model and the fractional logit regression model all perform well across the simulation experiments under consideration. When employing beta regression to estimate covariate effects on (0,1) response data, researchers should ensure their dispersion sub-model is properly specified, else inferential errors could arise. BioMed Central 2014-01-24 /pmc/articles/PMC3999882/ /pubmed/24461057 http://dx.doi.org/10.1186/1471-2288-14-14 Text en Copyright © 2014 Meaney and Moineddin; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Meaney, Christopher
Moineddin, Rahim
A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
title A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
title_full A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
title_fullStr A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
title_full_unstemmed A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
title_short A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
title_sort monte carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3999882/
https://www.ncbi.nlm.nih.gov/pubmed/24461057
http://dx.doi.org/10.1186/1471-2288-14-14
work_keys_str_mv AT meaneychristopher amontecarlosimulationstudycomparinglinearregressionbetaregressionvariabledispersionbetaregressionandfractionallogitregressionatrecoveringaveragedifferencemeasuresinatwosampledesign
AT moineddinrahim amontecarlosimulationstudycomparinglinearregressionbetaregressionvariabledispersionbetaregressionandfractionallogitregressionatrecoveringaveragedifferencemeasuresinatwosampledesign
AT meaneychristopher montecarlosimulationstudycomparinglinearregressionbetaregressionvariabledispersionbetaregressionandfractionallogitregressionatrecoveringaveragedifferencemeasuresinatwosampledesign
AT moineddinrahim montecarlosimulationstudycomparinglinearregressionbetaregressionvariabledispersionbetaregressionandfractionallogitregressionatrecoveringaveragedifferencemeasuresinatwosampledesign