Evaluation of negative binomial and zero-inflated negative binomial models for the analysis of zero-inflated count data: application to the telemedicine for children with medical complexity trial

BACKGROUND: Two characteristics of commonly used outcomes in medical research are zero inflation and non-negative integers; examples include the number of hospital admissions or emergency department visits, where the majority of patients will have zero counts. Zero-inflated regression models were de...

Bibliographic Details
Main authors: Lee, Kyung Hyun, Pedroza, Claudia, Avritscher, Elenir B. C., Mosquera, Ricardo A., Tyson, Jon E.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10523642/
https://www.ncbi.nlm.nih.gov/pubmed/37752579
http://dx.doi.org/10.1186/s13063-023-07648-8
author Lee, Kyung Hyun
Pedroza, Claudia
Avritscher, Elenir B. C.
Mosquera, Ricardo A.
Tyson, Jon E.
collection PubMed
description BACKGROUND: Two characteristics common to outcomes in medical research are zero inflation and non-negative integer values; examples include the number of hospital admissions or emergency department visits, where the majority of patients have zero counts. Zero-inflated regression models were devised to analyze this type of data. However, neither the performance of zero-inflated regression models nor the properties of the data best suited to them have been thoroughly investigated.
METHODS: We conducted a simulation study to evaluate the performance of two generalized linear models, negative binomial (NB) and zero-inflated negative binomial (ZINB), for analyzing zero-inflated (ZI) count data. Simulation scenarios assumed a randomized controlled trial design and varied the true underlying distribution, sample size, and rate of zero inflation. We compared the models in terms of bias, mean squared error, and coverage. Additionally, we used logistic regression to determine which data properties are most important for predicting the best-fitting model.
RESULTS: First, regardless of the rate of zero inflation, there was little difference between the conventional NB model and its ZI counterpart in terms of bias of the marginal treatment group coefficient. Second, even when the outcome was simulated from a ZI distribution, the NB model was favored over its ZI counterpart by the Akaike Information Criterion. Third, the mean and skewness of the non-zero part of the data were stronger predictors of model preference than the percentage of zero counts. These results were not affected by the sample size, which ranged from 60 to 800.
CONCLUSIONS: We recommend that the rate of zero inflation and overdispersion in the outcome should not be the sole or main justification for choosing zero-inflated regression models. Investigators should also consider other data characteristics when choosing a model for count data. In addition, if the performance of the NB and ZINB regression models is reasonably comparable even with ZI outcomes, we advocate the use of the NB regression model because its results have a clear and straightforward interpretation.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13063-023-07648-8.
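The kind of data the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' simulation code: the function names `simulate_zinb` and `summarize`, the arm means, the dispersion, and the zero-inflation probability are all assumptions chosen for the example; only the total sample size of 800 is taken from the abstract. The sketch draws a two-arm trial from a zero-inflated negative binomial distribution and reports the three data characteristics the abstract names: the percentage of zero counts and the mean and skewness of the non-zero part.

```python
import numpy as np


def simulate_zinb(n, mean, dispersion, zero_prob, rng):
    """Draw n counts from a zero-inflated negative binomial distribution.

    With probability zero_prob a count is a structural zero; otherwise it is
    drawn from a negative binomial with the given mean and with variance
    mean + mean**2 / dispersion.
    """
    # NumPy parameterizes the negative binomial by (n, p); convert from
    # the (mean, dispersion) form used in regression models.
    p = dispersion / (dispersion + mean)
    counts = rng.negative_binomial(dispersion, p, size=n)
    structural_zero = rng.random(n) < zero_prob
    return np.where(structural_zero, 0, counts)


def summarize(y):
    """Return percent zeros plus mean and skewness of the non-zero counts."""
    nonzero = y[y > 0]
    centered = nonzero - nonzero.mean()
    skew = (centered**3).mean() / (centered**2).mean() ** 1.5
    return {
        "pct_zero": 100 * (y == 0).mean(),
        "nonzero_mean": nonzero.mean(),
        "nonzero_skew": skew,
    }


# Hypothetical scenario: all parameter values below are illustrative.
rng = np.random.default_rng(2023)
n_per_arm = 400  # 800 in total, the largest sample size in the abstract
control = simulate_zinb(n_per_arm, mean=2.0, dispersion=1.0, zero_prob=0.3, rng=rng)
treated = simulate_zinb(n_per_arm, mean=1.2, dispersion=1.0, zero_prob=0.3, rng=rng)

for name, y in [("control", control), ("treated", treated)]:
    print(name, summarize(y))
```

Because both arms share the same assumed zero-inflation probability, the true marginal rate ratio in this sketch is simply the ratio of the count-part means (1.2 / 2.0). Fitting NB and ZINB regressions to such data and comparing them by AIC would be the next step; that part is omitted here.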
format Online
Article
Text
id pubmed-10523642
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10523642 2023-09-28 Trials Research BioMed Central 2023-09-27 /pmc/articles/PMC10523642/ /pubmed/37752579 http://dx.doi.org/10.1186/s13063-023-07648-8 Text en © The Author(s) 2023
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Evaluation of negative binomial and zero-inflated negative binomial models for the analysis of zero-inflated count data: application to the telemedicine for children with medical complexity trial
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10523642/
https://www.ncbi.nlm.nih.gov/pubmed/37752579
http://dx.doi.org/10.1186/s13063-023-07648-8