A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros

Binary outcomes are extremely common in biomedical research. Despite its popularity, binomial regression often fails to model this kind of data accurately due to the overdispersion problem. Many alternatives can be found in the literature, the beta‐binomial (BB) regression model being one of the mos...

Bibliographic Details
Main Authors: Ascari, Roberto; Migliorati, Sonia
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8360060/
https://www.ncbi.nlm.nih.gov/pubmed/33960503
http://dx.doi.org/10.1002/sim.9005
author Ascari, Roberto
Migliorati, Sonia
collection PubMed
description Binary outcomes are extremely common in biomedical research. Despite its popularity, binomial regression often fails to model this kind of data accurately due to the overdispersion problem. Many alternatives can be found in the literature, the beta‐binomial (BB) regression model being one of the most popular. The additional parameter of this model enables a better fit to overdispersed data. It also exhibits an attractive interpretation in terms of the intraclass correlation coefficient. Nonetheless, in many real data applications, a single additional parameter cannot handle the entire excess of variability. In this study, we propose a new finite mixture distribution with BB components, namely, the flexible beta‐binomial (FBB), which is characterized by a richer parameterization. This allows us to enhance the variance structure to account for multiple causes of overdispersion while also preserving the intraclass correlation interpretation. The novel regression model, based on the FBB distribution, exploits the flexibility and large variety of the distribution's possible shapes (which includes bimodality and various tail behaviors). Thus, it succeeds in accounting for several (possibly concomitant) sources of overdispersion stemming from the presence of latent groups in the population, outliers, and excessive zero observations. Adopting a Bayesian approach to inference, we perform an intensive simulation study that shows the superiority of the new regression model over that of the existing ones. Its better performance is also confirmed by three applications to real datasets extensively studied in the biomedical literature, namely, bacteria data, atomic bomb radiation data, and control mice data.
format Online
Article
Text
id pubmed-8360060
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-8360060 2021-08-17 Stat Med Research Articles John Wiley and Sons Inc. 2021-05-07 2021-07-30 /pmc/articles/PMC8360060/ /pubmed/33960503 http://dx.doi.org/10.1002/sim.9005 Text en © 2021 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros
topic Research Articles