Comparison of ischemic stroke diagnosis models based on machine learning
BACKGROUND: The incidence, prevalence, and mortality of ischemic stroke (IS) continue to rise, resulting in a serious global disease burden. The prediction models have a great value in the early prediction and diagnosis of IS. METHODS: The R software was used to screen the differentially expressed g...
Main authors: | Yang, Wan-Xia; Wang, Fang-Fang; Pan, Yun-Yan; Xie, Jian-Qin; Lu, Ming-Hua; You, Chong-Ge |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Neurology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9762505/ https://www.ncbi.nlm.nih.gov/pubmed/36545400 http://dx.doi.org/10.3389/fneur.2022.1014346 |
_version_ | 1784852876298813440 |
---|---|
author | Yang, Wan-Xia Wang, Fang-Fang Pan, Yun-Yan Xie, Jian-Qin Lu, Ming-Hua You, Chong-Ge |
collection | PubMed |
description | BACKGROUND: The incidence, prevalence, and mortality of ischemic stroke (IS) continue to rise, resulting in a serious global disease burden. Prediction models have great value in the early prediction and diagnosis of IS. METHODS: The R software was used to screen the differentially expressed genes (DEGs) between IS and control samples in the datasets GSE16561, GSE58294, and GSE37587 and to perform enrichment analysis on the DEGs. The feature genes of IS were obtained by several machine learning algorithms, including least absolute shrinkage and selection operator (LASSO) logistic regression, support vector machine-recursive feature elimination (SVM-RFE), and Random Forest (RF). The IS diagnostic models were constructed based on transcriptomics by machine learning and an artificial neural network (ANN). RESULTS: A total of 69 DEGs, mainly involved in immune and inflammatory responses, were identified. The pathways enriched in the IS group were complement and coagulation cascades, lysosome, PPAR signaling pathway, regulation of autophagy, and toll-like receptor signaling pathway. LASSO, SVM-RFE, and RF selected 17, 10, and 12 feature genes, respectively. The area under the curve (AUC) of the LASSO model in the training dataset, GSE22255, and GSE195442 was 0.969, 0.890, and 1.000, respectively. The AUC of the SVM-RFE model was 0.957, 0.805, and 1.000, respectively. The AUC of the RF model was 0.947, 0.935, and 1.000, respectively. The models have good sensitivity, specificity, and accuracy. The AUC of the LASSO+ANN, SVM-RFE+ANN, and RF+ANN models was 1.000, 0.995, and 0.997, respectively, in the training dataset. However, the AUC of the LASSO+ANN, SVM-RFE+ANN, and RF+ANN models was 0.688, 0.605, and 0.619, respectively, in the GSE22255 dataset. The AUC of the LASSO+ANN and RF+ANN models was 0.740 and 0.630, respectively, in the GSE195442 dataset. In the training dataset, the sensitivity, specificity, and accuracy of the LASSO+ANN model were 1.000, 1.000, and 1.000, respectively; those of the SVM-RFE+ANN model were 0.946, 0.982, and 0.964, respectively; and those of the RF+ANN model were 0.964, 1.000, and 0.982, respectively. In the test datasets, the sensitivity was very satisfactory; however, the specificity and accuracy were not good. CONCLUSION: The LASSO, SVM-RFE, and RF models have good prediction abilities. However, the ANN model is efficient at classifying positive samples and is unsuitable for classifying negative samples. |
format | Online Article Text |
id | pubmed-9762505 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9762505 2022-12-20 Comparison of ischemic stroke diagnosis models based on machine learning Yang, Wan-Xia Wang, Fang-Fang Pan, Yun-Yan Xie, Jian-Qin Lu, Ming-Hua You, Chong-Ge Front Neurol Neurology Frontiers Media S.A. 2022-12-05 /pmc/articles/PMC9762505/ /pubmed/36545400 http://dx.doi.org/10.3389/fneur.2022.1014346 Text en Copyright © 2022 Yang, Wang, Pan, Xie, Lu and You. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Comparison of ischemic stroke diagnosis models based on machine learning |
topic | Neurology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9762505/ https://www.ncbi.nlm.nih.gov/pubmed/36545400 http://dx.doi.org/10.3389/fneur.2022.1014346 |
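
The abstract reports sensitivity, specificity, and accuracy for each model without defining them. The sketch below shows how these three metrics follow from a confusion matrix. The counts are hypothetical: the record does not state sample sizes, so `training_like` is chosen only so that the results round to the values quoted for the SVM-RFE+ANN model in the training dataset, and `skewed` is an invented case illustrating the conclusion that a classifier can detect positive samples well while misclassifying many negative ones. The class and function names are illustrative, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # IS samples classified as IS
    fn: int  # IS samples classified as control
    tn: int  # control samples classified as control
    fp: int  # control samples classified as IS


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true IS samples that the model flags as IS."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of true control samples that the model flags as control."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all samples that the model classifies correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts (assumption: 56 IS and 56 control samples).
training_like = ConfusionCounts(tp=53, fn=3, tn=55, fp=1)

# Hypothetical counts for a model that favors the positive class.
skewed = ConfusionCounts(tp=19, fn=1, tn=6, fp=14)

for name, counts in [("training_like", training_like), ("skewed", skewed)]:
    print(
        name,
        round(sensitivity(counts), 3),
        round(specificity(counts), 3),
        round(accuracy(counts), 3),
    )
```

With the `skewed` counts, sensitivity stays high while specificity and accuracy drop, which is the pattern the abstract describes for the ANN models on the test datasets.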