Prediction of Long-Term Stroke Recurrence Using Machine Learning Models


Bibliographic Details
Main Authors: Abedi, Vida; Avula, Venkatesh; Chaudhary, Durgesh; Shahjouei, Shima; Khan, Ayesha; Griessenauer, Christoph J; Li, Jiang; Zand, Ramin
Format: Online Article Text
Language: English
Published: MDPI 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8003970/
https://www.ncbi.nlm.nih.gov/pubmed/33804724
http://dx.doi.org/10.3390/jcm10061286
Collection: PubMed
Description:
Background: The long-term risk of recurrent ischemic stroke, estimated to be between 17% and 30%, cannot be reliably assessed at an individual level. Our goal was to study whether machine-learning can be trained to predict stroke recurrence and identify key clinical variables and assess whether performance metrics can be optimized.
Methods: We used patient-level data from electronic health records, six interpretable algorithms (Logistic Regression, Extreme Gradient Boosting, Gradient Boosting Machine, Random Forest, Support Vector Machine, Decision Tree), four feature selection strategies, five prediction windows, and two sampling strategies to develop 288 models for up to 5-year stroke recurrence prediction. We further identified important clinical features and different optimization strategies.
Results: We included 2091 ischemic stroke patients. Model area under the receiver operating characteristic (AUROC) curve was stable for prediction windows of 1, 2, 3, 4, and 5 years, with the highest score for the 1-year (0.79) and the lowest score for the 5-year prediction window (0.69). A total of 21 (7%) models reached an AUROC above 0.73 while 110 (38%) models reached an AUROC greater than 0.7. Among the 53 features analyzed, age, body mass index, and laboratory-based features (such as high-density lipoprotein, hemoglobin A1c, and creatinine) had the highest overall importance scores. The balance between specificity and sensitivity improved through sampling strategies.
Conclusion: All of the selected six algorithms could be trained to predict the long-term stroke recurrence and laboratory-based variables were highly associated with stroke recurrence. The latter could be targeted for personalized interventions. Model performance metrics could be optimized, and models can be implemented in the same healthcare system as intelligent decision support for targeted intervention.
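For readers unfamiliar with the metrics named in the description, the following is a minimal sketch of how AUROC, sensitivity, and specificity are commonly computed from predicted risk scores. It is not the authors' code: the function names `auroc` and `sensitivity_specificity` and the toy labels and scores are hypothetical, chosen only for illustration, and none of the numbers come from the study.

```python
from itertools import product


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as predicted recurrence and return
    (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = recurrence within the window, 0 = no recurrence.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8, 0.3, 0.1]

print("AUROC:", round(auroc(labels, scores), 3))
for threshold in (0.5, 0.75):
    sens, spec = sensitivity_specificity(labels, scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

AUROC does not depend on a decision threshold, whereas sensitivity and specificity do. Raising the threshold in the toy data trades sensitivity for specificity, which is the kind of balance the description says was adjusted through sampling strategies.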
Record ID: pubmed-8003970
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Journal: J Clin Med
Published: MDPI, 2021-03-20
License: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).