Development and validation of an interpretable machine learning model—Predicting mild cognitive impairment in a high-risk stroke population

Bibliographic Details
Main authors: Yan, Feng-Juan; Chen, Xie-Hui; Quan, Xiao-Qing; Wang, Li-Li; Wei, Xin-Yi; Zhu, Jia-Liang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Neuroscience
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10308219/
https://www.ncbi.nlm.nih.gov/pubmed/37396650
http://dx.doi.org/10.3389/fnagi.2023.1180351
author Yan, Feng-Juan
Chen, Xie-Hui
Quan, Xiao-Qing
Wang, Li-Li
Wei, Xin-Yi
Zhu, Jia-Liang
collection PubMed
description BACKGROUND: Mild cognitive impairment (MCI) is considered a preclinical stage of Alzheimer’s disease (AD). People with MCI have a higher risk of developing dementia than healthy people. Stroke is one of the risk factors for MCI and is already actively treated and managed. Therefore, studying the high-risk stroke population and identifying the risk factors for MCI as early as possible can help prevent MCI more effectively. METHODS: The Boruta algorithm was used to screen variables, and eight machine learning models were established and evaluated. The best-performing models were used to assess variable importance and to build an online risk calculator. Shapley additive explanations (SHAP) were used to explain the model. RESULTS: A total of 199 patients were included in the study, 99 of whom were male. The Boruta algorithm selected nine variables: transient ischemic attack (TIA), homocysteine, education, hematocrit (HCT), diabetes, hemoglobin, red blood cells (RBC), hypertension, and prothrombin time (PT). Logistic regression (AUC = 0.8595) was the best model for predicting MCI in the high-risk stroke population, followed by elastic net (ENET) (AUC = 0.8312), multilayer perceptron (MLP) (AUC = 0.7908), extreme gradient boosting (XGBoost) (AUC = 0.7691), support vector machine (SVM) (AUC = 0.7527), random forest (RF) (AUC = 0.7451), K-nearest neighbors (KNN) (AUC = 0.7380), and decision tree (DT) (AUC = 0.6972). Variable importance analysis ranked TIA, diabetes, education, and hypertension as the four most important variables. CONCLUSION: TIA, diabetes, education, and hypertension are the most important risk factors for MCI in the high-risk stroke population, and early intervention should be performed to reduce the occurrence of MCI.
format Online
Article
Text
id pubmed-10308219
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10308219 2023-06-30 Front Aging Neurosci Neuroscience Frontiers Media S.A. 2023-06-15 /pmc/articles/PMC10308219/ /pubmed/37396650 http://dx.doi.org/10.3389/fnagi.2023.1180351 Text en
Copyright © 2023 Yan, Chen, Quan, Wang, Wei and Zhu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Development and validation of an interpretable machine learning model—Predicting mild cognitive impairment in a high-risk stroke population
topic Neuroscience