
Interpretable machine learning for predicting 28-day all-cause in-hospital mortality for hypertensive ischemic or hemorrhagic stroke patients in the ICU: a multi-center retrospective cohort study with internal and external cross-validation

Bibliographic Details
Main authors: Huang, Jian; Chen, Huaqiao; Deng, Jiewen; Liu, Xiaozhu; Shu, Tingting; Yin, Chengliang; Duan, Minjie; Fu, Li; Wang, Kai; Zeng, Song
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10443100/
https://www.ncbi.nlm.nih.gov/pubmed/37614971
http://dx.doi.org/10.3389/fneur.2023.1185447
Description
Summary: BACKGROUND: Timely and accurate outcome prediction plays a critical role in guiding clinical decisions for hypertensive ischemic or hemorrhagic stroke patients admitted to the ICU. However, interpreting predictive models and translating them into clinical applications are as important as the prediction itself. This study aimed to develop an interpretable machine learning (IML) model that accurately predicts 28-day all-cause mortality in hypertensive ischemic or hemorrhagic stroke patients.
METHODS: A total of 4,274 hypertensive ischemic or hemorrhagic stroke patients admitted to ICUs in the USA, drawn from multicenter cohorts, were included in this study to develop and validate the IML model. Five machine learning (ML) models were developed to predict mortality using the MIMIC-IV and eICU-CRD databases: artificial neural network (ANN), gradient boosting machine (GBM), eXtreme Gradient Boosting (XGBoost), logistic regression (LR), and support vector machine (SVM). Feature selection was performed using the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm. Model performance was evaluated based on the area under the curve (AUC), accuracy, positive predictive value (PPV), and negative predictive value (NPV). The ML model with the best predictive performance was selected for interpretability analysis. Finally, the SHapley Additive exPlanations (SHAP) method was employed to evaluate the risk of all-cause in-hospital mortality among hypertensive ischemic or hemorrhagic stroke patients admitted to the ICU.
RESULTS: The XGBoost model demonstrated the best predictive performance, with AUC values of 0.822, 0.739, and 0.700 in the training, test, and external cohorts, respectively. The analysis of feature importance revealed that age, ethnicity, white blood cell (WBC) count, hyperlipidemia, mean corpuscular volume (MCV), glucose, pulse oximeter oxygen saturation (SpO2), serum calcium, red blood cell distribution width (RDW), blood urea nitrogen (BUN), and bicarbonate were the 11 most important features. SHAP plots were employed to interpret the XGBoost model.
CONCLUSIONS: The XGBoost model accurately predicted 28-day all-cause in-hospital mortality among hypertensive ischemic or hemorrhagic stroke patients admitted to the ICU. The SHAP method can provide explicit explanations of personalized risk prediction, which can aid physicians in understanding the model.
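For readers unfamiliar with the evaluation metrics named in the summary (AUC, accuracy, PPV, NPV), the following is a minimal sketch of how they are commonly computed from true labels and predicted risk scores. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy data are assumptions made purely for illustration, and the AUC is computed with the standard pairwise-ranking definition rather than any particular library.

```python
from typing import Sequence, Tuple


def accuracy_ppv_npv(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Return (accuracy, PPV, NPV) after thresholding the risk scores.

    The 0.5 threshold is an illustrative assumption, not a value from the study.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted death, died
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted death, survived
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # predicted survival, survived
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted survival, died
    accuracy = (tp + tn) / len(pairs)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return accuracy, ppv, npv


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = died within 28 days, 0 = survived.
    labels = [1, 0, 1, 0, 0, 1]
    scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
    print("AUC:", auc(labels, scores))
    print("accuracy, PPV, NPV:", accuracy_ppv_npv(labels, scores))
```

AUC is threshold-free, whereas accuracy, PPV, and NPV depend on the chosen cutoff and on how common the outcome is in the cohort, which is one reason the same model can report different values across training, test, and external cohorts.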