Common sampling and modeling approaches to analyzing readmission risk that ignore clustering produce misleading results

BACKGROUND: There is little consensus on how to sample hospitalizations and analyze multiple variables to model readmission risk. The purpose of this study was to compare readmission rates and the accuracy of predictive models based on different sampling and multivariable modeling approaches. METHOD...

Bibliographic Details
Main Authors: Zhao, Huaqing, Tanner, Samuel, Golden, Sherita H., Fisher, Susan G., Rubin, Daniel J.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7687737/
https://www.ncbi.nlm.nih.gov/pubmed/33238884
http://dx.doi.org/10.1186/s12874-020-01162-0
author Zhao, Huaqing
Tanner, Samuel
Golden, Sherita H.
Fisher, Susan G.
Rubin, Daniel J.
collection PubMed
description BACKGROUND: There is little consensus on how to sample hospitalizations and analyze multiple variables to model readmission risk. The purpose of this study was to compare readmission rates and the accuracy of predictive models based on different sampling and multivariable modeling approaches. METHODS: We conducted a retrospective cohort study of 17,284 adult diabetes patients with 44,203 discharges from an urban academic medical center between 1/1/2004 and 12/31/2012. Models for all-cause 30-day readmission were developed by four strategies: logistic regression using the first discharge per patient (LR-first), logistic regression using all discharges (LR-all), generalized estimating equations (GEE) using all discharges, and cluster-weighted (CWGEE) using all discharges. Multiple sets of models were developed and internally validated across a range of sample sizes. RESULTS: The readmission rate was 10.2% among first discharges and 20.3% among all discharges, revealing that sampling only first discharges underestimates a population’s readmission rate. Number of discharges was highly correlated with number of readmissions (r = 0.87, P < 0.001). Accounting for clustering with GEE and CWGEE yielded more conservative estimates of model performance than LR-all. LR-first produced falsely optimistic Brier scores. Model performance was unstable below samples of 6000–8000 discharges and stable in larger samples. GEE and CWGEE performed better in larger samples than in smaller samples. CONCLUSIONS: Hospital readmission risk models should be based on all discharges as opposed to just the first discharge per patient and utilize methods that account for clustered data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-020-01162-0.
format Online
Article
Text
id pubmed-7687737
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-76877372020-11-30 Common sampling and modeling approaches to analyzing readmission risk that ignore clustering produce misleading results Zhao, Huaqing Tanner, Samuel Golden, Sherita H. Fisher, Susan G. Rubin, Daniel J. BMC Med Res Methodol Research Article BACKGROUND: There is little consensus on how to sample hospitalizations and analyze multiple variables to model readmission risk. The purpose of this study was to compare readmission rates and the accuracy of predictive models based on different sampling and multivariable modeling approaches. METHODS: We conducted a retrospective cohort study of 17,284 adult diabetes patients with 44,203 discharges from an urban academic medical center between 1/1/2004 and 12/31/2012. Models for all-cause 30-day readmission were developed by four strategies: logistic regression using the first discharge per patient (LR-first), logistic regression using all discharges (LR-all), generalized estimating equations (GEE) using all discharges, and cluster-weighted (CWGEE) using all discharges. Multiple sets of models were developed and internally validated across a range of sample sizes. RESULTS: The readmission rate was 10.2% among first discharges and 20.3% among all discharges, revealing that sampling only first discharges underestimates a population’s readmission rate. Number of discharges was highly correlated with number of readmissions (r = 0.87, P < 0.001). Accounting for clustering with GEE and CWGEE yielded more conservative estimates of model performance than LR-all. LR-first produced falsely optimistic Brier scores. Model performance was unstable below samples of 6000–8000 discharges and stable in larger samples. GEE and CWGEE performed better in larger samples than in smaller samples. CONCLUSIONS: Hospital readmission risk models should be based on all discharges as opposed to just the first discharge per patient and utilize methods that account for clustered data. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-020-01162-0. BioMed Central 2020-11-25 /pmc/articles/PMC7687737/ /pubmed/33238884 http://dx.doi.org/10.1186/s12874-020-01162-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Common sampling and modeling approaches to analyzing readmission risk that ignore clustering produce misleading results
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7687737/
https://www.ncbi.nlm.nih.gov/pubmed/33238884
http://dx.doi.org/10.1186/s12874-020-01162-0
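
Illustrative sketch (not from the article): the abstract contrasts readmission rates computed from first discharges only with rates computed from all discharges, and reports Brier scores under methods that do or do not account for clustering by patient. The toy example below shows how those quantities could be computed. The data, the function names, and the constant prediction of 0.25 are all hypothetical. The inverse-cluster-size weights are one common reading of "cluster-weighted" and are assumed here; the abstract does not spell out the authors' weighting scheme.

```python
from collections import defaultdict

# Hypothetical toy data: (patient_id, discharge_order, readmitted_within_30_days)
discharges = [
    ("A", 1, 0),
    ("B", 1, 0),
    ("C", 1, 1), ("C", 2, 1), ("C", 3, 0),
    ("D", 1, 0), ("D", 2, 1),
]


def readmission_rate(rows):
    """Fraction of discharges followed by a 30-day readmission."""
    return sum(row[2] for row in rows) / len(rows)


def first_discharges(rows):
    """Keep only the earliest discharge for each patient."""
    first = {}
    for row in sorted(rows, key=lambda r: (r[0], r[1])):
        first.setdefault(row[0], row)
    return list(first.values())


def inverse_cluster_size_weights(rows):
    """Weight each discharge by 1 / (number of discharges for that patient).

    Assumption: this is a common cluster-weighting choice, so that every
    patient contributes a total weight of 1 regardless of discharge count.
    """
    counts = defaultdict(int)
    for patient_id, _, _ in rows:
        counts[patient_id] += 1
    return [1.0 / counts[patient_id] for patient_id, _, _ in rows]


def brier_score(outcomes, predictions, weights=None):
    """(Weighted) mean squared difference between predicted risk and outcome."""
    if weights is None:
        weights = [1.0] * len(outcomes)
    weighted_error = sum(
        w * (p - y) ** 2 for w, p, y in zip(weights, predictions, outcomes)
    )
    return weighted_error / sum(weights)


rate_all = readmission_rate(discharges)
rate_first = readmission_rate(first_discharges(discharges))

outcomes = [row[2] for row in discharges]
predictions = [0.25] * len(discharges)  # hypothetical constant risk estimate
weights = inverse_cluster_size_weights(discharges)

brier_unweighted = brier_score(outcomes, predictions)
brier_cluster_weighted = brier_score(outcomes, predictions, weights)

print(f"rate, all discharges:   {rate_all:.3f}")
print(f"rate, first discharges: {rate_first:.3f}")
print(f"Brier, unweighted:       {brier_unweighted:.3f}")
print(f"Brier, cluster-weighted: {brier_cluster_weighted:.3f}")
```

In this toy data, patients with several discharges account for all but one of the readmissions, so the first-discharge rate falls below the all-discharge rate. That matches the direction of the 10.2% versus 20.3% result in the abstract. The toy numbers themselves carry no evidential weight.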