A comparison of statistical methods for modeling count data with an application to hospital length of stay

BACKGROUND: Hospital length of stay (LOS) is a key indicator of hospital care management efficiency, cost of care, and hospital planning. Hospital LOS is often used as a measure of a post-medical procedure outcome, as a guide to the benefit of a treatment of interest, or as an important risk factor...

Bibliographic Details
Main authors:	Fernandez, Gustavo A., Vatcheva, Kristina P.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9351158/ https://www.ncbi.nlm.nih.gov/pubmed/35927612 http://dx.doi.org/10.1186/s12874-022-01685-8

_version_	1784762380758024192
author	Fernandez, Gustavo A. Vatcheva, Kristina P.
author_facet	Fernandez, Gustavo A. Vatcheva, Kristina P.
author_sort	Fernandez, Gustavo A.
collection	PubMed
description	BACKGROUND: Hospital length of stay (LOS) is a key indicator of hospital care management efficiency, cost of care, and hospital planning. Hospital LOS is often used as a measure of a post-medical procedure outcome, as a guide to the benefit of a treatment of interest, or as an important risk factor for adverse events. Therefore, understanding hospital LOS variability is always an important healthcare focus. Hospital LOS data can be treated as count data, with discrete and non-negative values, typically right skewed, and often exhibiting excessive zeros. In this study, we compared the performance of the Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB) regression models using simulated and empirical data. METHODS: Data were generated under different simulation scenarios with varying sample sizes, proportions of zeros, and levels of overdispersion. Analysis of hospital LOS was conducted using empirical data from the Medical Information Mart for Intensive Care database. RESULTS: Poisson and ZIP models performed poorly in overdispersed data. ZIP outperformed the other regression models when the overdispersion was due to zero-inflation only. NB and ZINB regression models faced substantial convergence issues when incorrectly used to model equidispersed data. The NB model provided the best fit in overdispersed data and outperformed the ZINB model in many simulation scenarios with combinations of zero-inflation and overdispersion, regardless of the sample size. In the empirical data analysis, we demonstrated that fitting incorrect models to overdispersed data led to incorrect regression coefficient estimates and overstated the significance of some predictors.
CONCLUSIONS: Based on this study, we recommend that researchers consider ZIP models for count data with zero-inflation only and NB models for overdispersed data or data with combinations of zero-inflation and overdispersion. If the researcher believes there are two different data-generating mechanisms producing zeros, then the ZINB regression model may provide greater flexibility when modeling the zero-inflation and overdispersion.
format	Online Article Text
id	pubmed-9351158
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-9351158 2022-08-05 A comparison of statistical methods for modeling count data with an application to hospital length of stay Fernandez, Gustavo A. Vatcheva, Kristina P. BMC Med Res Methodol Research BioMed Central 2022-08-04 /pmc/articles/PMC9351158/ /pubmed/35927612 http://dx.doi.org/10.1186/s12874-022-01685-8 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle	Research Fernandez, Gustavo A. Vatcheva, Kristina P. A comparison of statistical methods for modeling count data with an application to hospital length of stay
title	A comparison of statistical methods for modeling count data with an application to hospital length of stay
title_full	A comparison of statistical methods for modeling count data with an application to hospital length of stay
title_fullStr	A comparison of statistical methods for modeling count data with an application to hospital length of stay
title_full_unstemmed	A comparison of statistical methods for modeling count data with an application to hospital length of stay
title_short	A comparison of statistical methods for modeling count data with an application to hospital length of stay
title_sort	comparison of statistical methods for modeling count data with an application to hospital length of stay
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9351158/ https://www.ncbi.nlm.nih.gov/pubmed/35927612 http://dx.doi.org/10.1186/s12874-022-01685-8
work_keys_str_mv	AT fernandezgustavoa acomparisonofstatisticalmethodsformodelingcountdatawithanapplicationtohospitallengthofstay AT vatchevakristinap acomparisonofstatisticalmethodsformodelingcountdatawithanapplicationtohospitallengthofstay AT fernandezgustavoa comparisonofstatisticalmethodsformodelingcountdatawithanapplicationtohospitallengthofstay AT vatchevakristinap comparisonofstatisticalmethodsformodelingcountdatawithanapplicationtohospitallengthofstay
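
Illustrative note: the abstract describes simulation scenarios that vary the proportion of zeros and the level of overdispersion. The sketch below is not the authors' code; it is a minimal, standard-library-only illustration of how such scenarios might be generated and diagnosed. The function names (`poisson`, `simulate`, `summarize`), the parameter names (`zero_prob`, `size`), and the scenario values are all assumptions made for this example. Negative binomial counts are drawn as a gamma-Poisson mixture, and zero-inflation is modeled as a structural zero with probability `zero_prob`.

```python
import math
import random
from statistics import mean, pvariance


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate(rng, n, mu, zero_prob=0.0, size=None):
    """Simulate n counts (hypothetical helper, not from the article).

    mu        -- mean of the count component
    zero_prob -- probability of a structural zero (zero-inflation)
    size      -- NB dispersion parameter; None gives a plain Poisson
                 component, smaller values give more overdispersion
    """
    counts = []
    for _ in range(n):
        if rng.random() < zero_prob:
            counts.append(0)  # structural zero
            continue
        # Gamma-Poisson mixture yields a negative binomial count
        # with mean mu and variance mu + mu**2 / size.
        lam = mu if size is None else rng.gammavariate(size, mu / size)
        counts.append(poisson(rng, lam))
    return counts


def summarize(counts):
    """Return simple diagnostics for overdispersion and excess zeros."""
    m = mean(counts)
    return {
        "mean": m,
        # Variance-to-mean ratio: about 1 for equidispersed Poisson data.
        "dispersion": pvariance(counts) / m,
        "zero_share": counts.count(0) / len(counts),
        # Share of zeros a Poisson model with the same mean would expect.
        "poisson_zero_share": math.exp(-m),
    }


if __name__ == "__main__":
    rng = random.Random(2022)
    scenarios = {
        "Poisson": dict(),
        "ZIP": dict(zero_prob=0.3),
        "NB": dict(size=1.0),
        "ZINB": dict(zero_prob=0.3, size=1.0),
    }
    for name, kwargs in scenarios.items():
        print(name, summarize(simulate(rng, 5000, mu=4.0, **kwargs)))
```

A dispersion ratio well above 1, or a zero share well above what a Poisson model with the same mean would expect, is the kind of signal that motivates moving from Poisson to NB, ZIP, or ZINB models. Note that zero-inflation alone also raises the variance-to-mean ratio, so the two diagnostics should be read together.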