
Comparison of ischemic stroke diagnosis models based on machine learning

BACKGROUND: The incidence, prevalence, and mortality of ischemic stroke (IS) continue to rise, resulting in a serious global disease burden. Prediction models have great value in the early prediction and diagnosis of IS. METHODS: The R software was used to screen the differentially expressed g...


Bibliographic Details
Main Authors:	Yang, Wan-Xia; Wang, Fang-Fang; Pan, Yun-Yan; Xie, Jian-Qin; Lu, Ming-Hua; You, Chong-Ge
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Neurology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9762505/ https://www.ncbi.nlm.nih.gov/pubmed/36545400 http://dx.doi.org/10.3389/fneur.2022.1014346

author	Yang, Wan-Xia; Wang, Fang-Fang; Pan, Yun-Yan; Xie, Jian-Qin; Lu, Ming-Hua; You, Chong-Ge
author_sort	Yang, Wan-Xia
collection	PubMed
description	BACKGROUND: The incidence, prevalence, and mortality of ischemic stroke (IS) continue to rise, resulting in a serious global disease burden. Prediction models have great value in the early prediction and diagnosis of IS. METHODS: The R software was used to screen the differentially expressed genes (DEGs) between IS and control samples in the datasets GSE16561, GSE58294, and GSE37587 and to perform enrichment analysis on the DEGs. The feature genes of IS were obtained by several machine learning algorithms, including least absolute shrinkage and selection operator (LASSO) logistic regression, support vector machine-recursive feature elimination (SVM-RFE), and Random Forest (RF). The IS diagnostic models were constructed from transcriptomic data by machine learning and an artificial neural network (ANN). RESULTS: A total of 69 DEGs, mainly involved in immune and inflammatory responses, were identified. The pathways enriched in the IS group were complement and coagulation cascades, lysosome, PPAR signaling pathway, regulation of autophagy, and toll-like receptor signaling pathway. LASSO, SVM-RFE, and RF selected 17, 10, and 12 feature genes, respectively. The area under the curve (AUC) of the LASSO model in the training dataset, GSE22255, and GSE195442 was 0.969, 0.890, and 1.000, respectively. The AUC of the SVM-RFE model was 0.957, 0.805, and 1.000, respectively. The AUC of the RF model was 0.947, 0.935, and 1.000, respectively. The models have good sensitivity, specificity, and accuracy. The AUC of the LASSO+ANN, SVM-RFE+ANN, and RF+ANN models was 1.000, 0.995, and 0.997, respectively, in the training dataset. However, the AUC of the LASSO+ANN, SVM-RFE+ANN, and RF+ANN models was 0.688, 0.605, and 0.619, respectively, in the GSE22255 dataset. The AUC of the LASSO+ANN and RF+ANN models was 0.740 and 0.630, respectively, in the GSE195442 dataset. In the training dataset, the sensitivity, specificity, and accuracy were 1.000, 1.000, and 1.000 for the LASSO+ANN model; 0.946, 0.982, and 0.964 for the SVM-RFE+ANN model; and 0.964, 1.000, and 0.982 for the RF+ANN model. In the test datasets, the sensitivity was very satisfactory; however, the specificity and accuracy were not good. CONCLUSION: The LASSO, SVM-RFE, and RF models have good prediction abilities. However, the ANN model is efficient at classifying positive samples and is unsuitable for classifying negative samples.
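The sensitivity, specificity, and accuracy figures quoted in the description all derive from confusion-matrix counts. The following is a minimal Python sketch of that arithmetic, not the authors' code (the record states only that R was used). The function names and the toy labels are hypothetical and are not data from the study; the sketch assumes binary labels with 1 = IS and 0 = control, and at least one sample of each class.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (assumed: 1 = IS, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from predicted class labels.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),  # share of true IS samples detected
        "specificity": tn / (tn + fp),  # share of controls correctly rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical toy labels: a classifier that calls almost everything positive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

In this toy case every positive sample is detected while most controls are misclassified, which is the same qualitative pattern the description reports for the ANN models on the test datasets: high sensitivity alongside poor specificity and accuracy.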
format	Online Article Text
id	pubmed-9762505
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9762505 2022-12-20 Comparison of ischemic stroke diagnosis models based on machine learning Yang, Wan-Xia; Wang, Fang-Fang; Pan, Yun-Yan; Xie, Jian-Qin; Lu, Ming-Hua; You, Chong-Ge Front Neurol Neurology Frontiers Media S.A. 2022-12-05 /pmc/articles/PMC9762505/ /pubmed/36545400 http://dx.doi.org/10.3389/fneur.2022.1014346 Text en Copyright © 2022 Yang, Wang, Pan, Xie, Lu and You. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Comparison of ischemic stroke diagnosis models based on machine learning
topic	Neurology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9762505/ https://www.ncbi.nlm.nih.gov/pubmed/36545400 http://dx.doi.org/10.3389/fneur.2022.1014346


Similar Items