Risk prediction of 30-day mortality after stroke using machine learning: a nationwide registry-based cohort study

Bibliographic Details
Main authors: Wang, Wenjuan; Rudd, Anthony G.; Wang, Yanzhong; Curcin, Vasa; Wolfe, Charles D.; Peek, Niels; Bray, Benjamin
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9137068/
https://www.ncbi.nlm.nih.gov/pubmed/35624434
http://dx.doi.org/10.1186/s12883-022-02722-1
Description
Summary: BACKGROUND: We aimed to develop and validate machine learning (ML) models for 30-day stroke mortality, for mortality risk stratification and as benchmarking models for quality improvement in stroke care. METHODS: Data from the UK Sentinel Stroke National Audit Program from 2013 to 2019 were used. Models were developed using XGBoost, Logistic Regression (LR), and LR with elastic net with/without interaction terms, using 80% randomly selected admissions from 2013 to 2018, validated on the remaining 20% of admissions, and temporally validated on 2019 admissions. The models were developed with 30 variables. A reference model was developed using LR and 4 variables. The performance of all models was evaluated in terms of discrimination, calibration, reclassification, Brier scores and decision curves. RESULTS: In total, 488,497 stroke patients with a 12.3% 30-day mortality rate were included in the analysis. In the 2019 temporal validation set, the XGBoost model obtained the lowest Brier score (0.069 (95% CI: 0.068–0.071)) and the highest area under the ROC curve (AUC) (0.895 (95% CI: 0.891–0.900)), outperforming the LR reference model by 0.04 AUC (p < 0.001) and the LR with elastic net and interaction terms model by 0.003 AUC (p < 0.001). All models were perfectly calibrated for low (< 5%) and moderate (5–15%) risk groups and showed ≈1% underestimation for high-risk groups (> 15%). The XGBoost model reclassified 1648 (8.1%) cases rated low-risk by the LR reference model as moderate or high-risk and gained the most net benefit in decision curve analysis. CONCLUSIONS: All models with 30 variables are potentially useful as benchmarking models in stroke-care quality improvement, with ML slightly outperforming others. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12883-022-02722-1.
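Illustrative note: the summary compares models by Brier score, AUC, and reclassification across the low (< 5%), moderate (5–15%), and high (> 15%) risk groups. The following is a minimal Python sketch of how those three quantities could be computed from predicted probabilities. It is not the authors' code; the function names and the toy data are hypothetical and chosen only for illustration, and the risk-group cut-offs are the only values taken from the summary.

```python
from typing import List, Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random death is ranked above a random
    survivor (ties count as one half). Quadratic time; fine for a sketch."""
    pos: List[float] = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg: List[float] = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def risk_group(p: float) -> str:
    """Risk groups as stated in the summary: < 5%, 5-15%, > 15%."""
    if p < 0.05:
        return "low"
    if p <= 0.15:
        return "moderate"
    return "high"


def count_upward_reclassified(ref_prob: Sequence[float],
                              new_prob: Sequence[float]) -> int:
    """Cases the reference model rates low-risk that the new model rates
    moderate or high-risk."""
    return sum(
        1
        for r, n in zip(ref_prob, new_prob)
        if risk_group(r) == "low" and risk_group(n) != "low"
    )


# Toy data, made up for illustration only (not from the registry).
y_true = [0, 0, 0, 1, 0, 1, 1, 0]
ref_prob = [0.02, 0.04, 0.03, 0.04, 0.10, 0.20, 0.30, 0.12]  # hypothetical reference model
new_prob = [0.01, 0.03, 0.06, 0.18, 0.08, 0.25, 0.40, 0.10]  # hypothetical ML model

if __name__ == "__main__":
    print("Brier (reference):", brier_score(y_true, ref_prob))
    print("Brier (new):      ", brier_score(y_true, new_prob))
    print("AUC (reference):  ", roc_auc(y_true, ref_prob))
    print("AUC (new):        ", roc_auc(y_true, new_prob))
    print("Low-risk cases moved up:", count_upward_reclassified(ref_prob, new_prob))
```

A lower Brier score and a higher AUC both indicate better predictions. In practice a library implementation (for example scikit-learn's metrics) would normally be preferred over hand-written versions like these.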