Evaluation of negative binomial and zero-inflated negative binomial models for the analysis of zero-inflated count data: application to the telemedicine for children with medical complexity trial
BACKGROUND: Two characteristics of commonly used outcomes in medical research are zero inflation and non-negative integers; examples include the number of hospital admissions or emergency department visits, where the majority of patients will have zero counts. Zero-inflated regression models were de...
Main Authors: | Lee, Kyung Hyun; Pedroza, Claudia; Avritscher, Elenir B. C.; Mosquera, Ricardo A.; Tyson, Jon E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10523642/ https://www.ncbi.nlm.nih.gov/pubmed/37752579 http://dx.doi.org/10.1186/s13063-023-07648-8 |
_version_ | 1785110603179753472 |
---|---|
author | Lee, Kyung Hyun Pedroza, Claudia Avritscher, Elenir B. C. Mosquera, Ricardo A. Tyson, Jon E. |
author_sort | Lee, Kyung Hyun |
collection | PubMed |
description | BACKGROUND: Two characteristics of commonly used outcomes in medical research are zero inflation and non-negative integers; examples include the number of hospital admissions or emergency department visits, where the majority of patients will have zero counts. Zero-inflated regression models were devised to analyze this type of data. However, the performance of zero-inflated regression models or the properties of data best suited for these analyses have not been thoroughly investigated. METHODS: We conducted a simulation study to evaluate the performance of two generalized linear models, negative binomial and zero-inflated negative binomial, for analyzing zero-inflated count data. Simulation scenarios assumed a randomized controlled trial design and varied the true underlying distribution, sample size, and rate of zero inflation. We compared the models in terms of bias, mean squared error, and coverage. Additionally, we used logistic regression to determine which data properties are most important for predicting the best-fitting model. RESULTS: We first found that, regardless of the rate of zero inflation, there was little difference between the conventional negative binomial and its zero-inflated counterpart in terms of bias of the marginal treatment group coefficient. Second, even when the outcome was simulated from a zero-inflated distribution, a negative binomial model was favored above its ZI counterpart in terms of the Akaike Information Criterion. Third, the mean and skewness of the non-zero part of the data were stronger predictors of model preference than the percentage of zero counts. These results were not affected by the sample size, which ranged from 60 to 800. CONCLUSIONS: We recommend that the rate of zero inflation and overdispersion in the outcome should not be the sole and main justification for choosing zero-inflated regression models. Investigators should also consider other data characteristics when choosing a model for count data. In addition, if the performance of the NB and ZINB regression models is reasonably comparable even with ZI outcomes, we advocate the use of the NB regression model due to its clear and straightforward interpretation of the results. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13063-023-07648-8. |
format | Online Article Text |
id | pubmed-10523642 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10523642 2023-09-28 Evaluation of negative binomial and zero-inflated negative binomial models for the analysis of zero-inflated count data: application to the telemedicine for children with medical complexity trial Lee, Kyung Hyun Pedroza, Claudia Avritscher, Elenir B. C. Mosquera, Ricardo A. Tyson, Jon E. Trials Research BACKGROUND: Two characteristics of commonly used outcomes in medical research are zero inflation and non-negative integers; examples include the number of hospital admissions or emergency department visits, where the majority of patients will have zero counts. Zero-inflated regression models were devised to analyze this type of data. However, the performance of zero-inflated regression models or the properties of data best suited for these analyses have not been thoroughly investigated. METHODS: We conducted a simulation study to evaluate the performance of two generalized linear models, negative binomial and zero-inflated negative binomial, for analyzing zero-inflated count data. Simulation scenarios assumed a randomized controlled trial design and varied the true underlying distribution, sample size, and rate of zero inflation. We compared the models in terms of bias, mean squared error, and coverage. Additionally, we used logistic regression to determine which data properties are most important for predicting the best-fitting model. RESULTS: We first found that, regardless of the rate of zero inflation, there was little difference between the conventional negative binomial and its zero-inflated counterpart in terms of bias of the marginal treatment group coefficient. Second, even when the outcome was simulated from a zero-inflated distribution, a negative binomial model was favored above its ZI counterpart in terms of the Akaike Information Criterion. Third, the mean and skewness of the non-zero part of the data were stronger predictors of model preference than the percentage of zero counts. These results were not affected by the sample size, which ranged from 60 to 800. CONCLUSIONS: We recommend that the rate of zero inflation and overdispersion in the outcome should not be the sole and main justification for choosing zero-inflated regression models. Investigators should also consider other data characteristics when choosing a model for count data. In addition, if the performance of the NB and ZINB regression models is reasonably comparable even with ZI outcomes, we advocate the use of the NB regression model due to its clear and straightforward interpretation of the results. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13063-023-07648-8. BioMed Central 2023-09-27 /pmc/articles/PMC10523642/ /pubmed/37752579 http://dx.doi.org/10.1186/s13063-023-07648-8 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Evaluation of negative binomial and zero-inflated negative binomial models for the analysis of zero-inflated count data: application to the telemedicine for children with medical complexity trial |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10523642/ https://www.ncbi.nlm.nih.gov/pubmed/37752579 http://dx.doi.org/10.1186/s13063-023-07648-8 |