Support Vector Machine for Lung Adenocarcinoma Staging Through Variant Pathways

Lung adenocarcinoma (LUAD) is one of the most common malignant tumors. How to effectively diagnose LUAD at an early stage and make an accurate judgement of the occurrence and progression of LUAD are still the focus of current research. Support vector machine (SVM) is one of the most effective method...

Bibliographic Details
Main authors: Di, Feng; He, Chunxiao; Pu, Guimei; Zhang, Chunyi
Format: Online Article Text
Language: English
Published: Genetics Society of America 2020
Subjects: Investigations
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7341118/
https://www.ncbi.nlm.nih.gov/pubmed/32444360
http://dx.doi.org/10.1534/g3.120.401207
author Di, Feng
He, Chunxiao
Pu, Guimei
Zhang, Chunyi
collection PubMed
description Lung adenocarcinoma (LUAD) is one of the most common malignant tumors. Diagnosing LUAD effectively at an early stage and judging its occurrence and progression accurately remain a focus of current research. Support vector machine (SVM) is one of the most effective methods for diagnosing LUAD of different stages. The study aimed to explore the dynamic change of differentially expressed genes (DEGs) across LUAD stages, to assess the risk of LUAD through DEG-enriched pathways, and to establish a diagnostic model based on the SVM method. Based on the TNM stages and gene expression profiles of 517 samples in the TCGA-LUAD database, the coefficient of variation (CV) combined with one-way analysis of variance (ANOVA) was used to screen out feature genes in different TNM stages after data standardization. Unsupervised clustering analysis was conducted on samples and feature genes. The feature genes were analyzed by Pearson correlation coefficient to construct a co-expression network. Fisher's exact test was conducted to verify the most enriched pathways, and the variation of each pathway across stages was analyzed. SVM models were trained, and ROC curves were drawn from the predicted results to evaluate the predictive effectiveness of the SVM model. Unsupervised hierarchical clustering showed that almost all stage III/IV samples clustered together, while stage I/II samples formed a cluster of their own. The correlations among feature genes differed between stages. In addition, as the malignancy of the lung cancer increased, the average shortest path length of the network gradually increased, while the closeness centrality gradually decreased. Finally, four feature pathways that could distinguish different stages of LUAD were obtained, and their discriminative ability was tested with the SVM model, which differentiated stage I/II from stage III/IV with an accuracy of up to 91%. Functional-level differences were quantified based on the expression of feature genes in lung cancer patients of different stages, so as to aid the diagnosis and prediction of lung cancer.
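The feature-gene screening step described above (CV combined with one-way ANOVA) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the thresholds `cv_min` and `f_min`, and the toy gene names are hypothetical, the CV is assumed to be computed on non-negative expression values, and a cutoff on the F statistic stands in for a p-value cutoff because only the Python standard library is used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    return pstdev(values) / m if m != 0 else float("inf")


def anova_f(groups):
    """One-way ANOVA F statistic for a list of sample groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [mean(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def screen_genes(expr, stages, cv_min=0.2, f_min=4.0):
    """Keep genes that are both variable (CV) and stage-dependent (F).

    expr   : dict mapping gene name -> list of expression values per sample
    stages : list of stage labels, one per sample, in the same order
    cv_min, f_min : hypothetical cutoffs chosen only for illustration
    """
    labels = sorted(set(stages))
    kept = []
    for gene, values in expr.items():
        groups = [[v for v, s in zip(values, stages) if s == lab]
                  for lab in labels]
        if (coefficient_of_variation(values) >= cv_min
                and anova_f(groups) >= f_min):
            kept.append(gene)
    return kept


# Toy data: GENE_A changes between stage groups, GENE_B stays flat.
expr = {
    "GENE_A": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "GENE_B": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],
}
stages = ["I/II"] * 3 + ["III/IV"] * 3
print(screen_genes(expr, stages))
```

In this toy input only GENE_A passes both filters. With real data the F statistic would normally be converted to a p-value and corrected for multiple testing, which this sketch leaves out.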
format Online
Article
Text
id pubmed-7341118
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-7341118 2020-07-21 Support Vector Machine for Lung Adenocarcinoma Staging Through Variant Pathways. Di, Feng; He, Chunxiao; Pu, Guimei; Zhang, Chunyi. G3 (Bethesda), Investigations. Genetics Society of America, 2020-05-22. /pmc/articles/PMC7341118/ /pubmed/32444360 http://dx.doi.org/10.1534/g3.120.401207 Text en. Copyright © 2020 Di et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Support Vector Machine for Lung Adenocarcinoma Staging Through Variant Pathways
topic Investigations
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7341118/
https://www.ncbi.nlm.nih.gov/pubmed/32444360
http://dx.doi.org/10.1534/g3.120.401207