
Nationwide hospital admission data statistics and disease-specific 30-day readmission prediction

PURPOSE: Hospital readmission prediction uses historical patient visit data to train machine learning models to predict risk of patients being readmitted after the discharge. Data used to train models, such as patient demographics, disease types, localized distributions etc., play significant roles...


Bibliographic Details
Main authors: Wang, Shuwen; Zhu, Xingquan
Format: Online Article Text
Language: English
Published: Springer International Publishing 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9439279/
https://www.ncbi.nlm.nih.gov/pubmed/36065327
http://dx.doi.org/10.1007/s13755-022-00195-7
_version_ 1784782020444946432
author Wang, Shuwen
Zhu, Xingquan
collection PubMed
description PURPOSE: Hospital readmission prediction uses historical patient visit data to train machine learning models that predict the risk of a patient being readmitted after discharge. The data used to train these models, such as patient demographics, disease types, and localized distributions, play a significant role in model performance. Many methods exist for hospital readmission prediction, but some important questions remain open. For example, how do demographics such as gender, age, and geographic location affect readmission prediction? Do readmission rates vary significantly among patients suffering from different diseases? What are the characteristics of nationwide hospital admission data? And how do hospital specialty, ownership, and location affect readmission rates? In this study, we carry out systematic investigations to answer these questions and propose a predictive modeling framework for disease-specific 30-day hospital readmission.
METHODS: We first perform statistical analysis using the National Readmission Databases (NRD), with over 15 million hospital visits. We then create features and disease-specific readmission datasets. An ensemble learning framework is proposed for hospital readmission prediction, and the Friedman test and Nemenyi post-hoc test are used to validate the proposed method.
RESULTS: Using the National Readmission Databases (NRD), with over 15 million hospital visits, as our testbed, we summarize nationwide patient admission data statistics in relation to demographics, disease types, and hospital factors. We use feature engineering to design 526 representative features to model each patient visit. Our study found that readmission rates vary significantly from disease to disease. For the six diseases studied, readmission rates range from 1.832% (Pneumonia) to 8.761% (Diabetes). Using random sampling and voting approaches, our study shows that soft voting outperforms hard voting on the majority of results, especially for AUC and balanced accuracy, which are the main measures for imbalanced data. Random under-sampling with a 1.1:1 negative:positive ratio achieves the best performance for AUC, balanced accuracy, and F1-score.
CONCLUSION: This paper carries out systematic studies to understand US nationwide hospital readmission data statistics and designs a machine learning framework for disease-specific 30-day hospital readmission prediction. Our study shows that hospital readmission rates vary significantly with respect to disease type, gender, age group, and other factors. Gradient boosting achieves the best performance for disease-specific hospital readmission prediction.
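The sampling and voting steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the ensemble is assembled, so the function names (`undersample`, `hard_vote`, `soft_vote`, `balanced_accuracy`), the 0.5 decision threshold, and the fixed random seed are all assumptions made for illustration. The sketch keeps every readmitted (positive) visit, draws about 1.1 non-readmitted (negative) visits per positive, and contrasts majority voting on thresholded predictions with averaging of predicted probabilities.

```python
import random


def undersample(labels, neg_pos_ratio=1.1, seed=0):
    """Keep every positive index and a random subset of negative indices.

    The number of negatives kept is round(neg_pos_ratio * number of positives),
    capped at the number of negatives available.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_neg = min(len(neg), round(neg_pos_ratio * len(pos)))
    return sorted(pos + rng.sample(neg, n_neg))


def hard_vote(model_probs, threshold=0.5):
    """Threshold each model's probability, then take a strict majority vote."""
    n_models = len(model_probs)
    votes = [sum(p >= threshold for p in col) for col in zip(*model_probs)]
    return [int(2 * v > n_models) for v in votes]


def soft_vote(model_probs, threshold=0.5):
    """Average the models' probabilities, then threshold the mean."""
    n_models = len(model_probs)
    return [int(sum(col) / n_models >= threshold) for col in zip(*model_probs)]


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on positives) and specificity."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


if __name__ == "__main__":
    # Toy labels: 10 readmissions among 200 visits (made-up numbers).
    labels = [1] * 10 + [0] * 190
    kept = undersample(labels)
    print(len(kept), "visits kept after under-sampling")

    # One row per model, one column per visit (made-up probabilities).
    model_probs = [
        [0.9, 0.6, 0.2],
        [0.4, 0.7, 0.1],
        [0.4, 0.3, 0.3],
    ]
    y_true = [1, 1, 0]
    print("hard:", balanced_accuracy(y_true, hard_vote(model_probs)))
    print("soft:", balanced_accuracy(y_true, soft_vote(model_probs)))
```

The toy probabilities are chosen so that the two voting rules disagree on one visit: a single confident model can lift the averaged probability above the threshold even when it is outvoted. The example only shows how the rules differ and is not evidence for the reported results.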
format Online
Article
Text
id pubmed-9439279
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-9439279 2022-09-06 Nationwide hospital admission data statistics and disease-specific 30-day readmission prediction Wang, Shuwen Zhu, Xingquan Health Inf Sci Syst Research Springer International Publishing 2022-09-02 /pmc/articles/PMC9439279/ /pubmed/36065327 http://dx.doi.org/10.1007/s13755-022-00195-7 Text en © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2022, Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
title Nationwide hospital admission data statistics and disease-specific 30-day readmission prediction
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9439279/
https://www.ncbi.nlm.nih.gov/pubmed/36065327
http://dx.doi.org/10.1007/s13755-022-00195-7