Identification of feature risk pathways of smoking-induced lung cancer based on SVM
OBJECTIVE: The present study aims to explore the role of smoking factors in the risk of lung cancer and to screen the feature risk pathways of smoking-induced lung cancer. METHODS: The expression profiles of the patient data from the GEO database were standardized, and differentially expressed genes (DEGs)...
Main Authors: | Chen, Rongjun; Lin, Jinhui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7272018/ https://www.ncbi.nlm.nih.gov/pubmed/32497048 http://dx.doi.org/10.1371/journal.pone.0233445 |
_version_ | 1783542180196909056 |
---|---|
author | Chen, Rongjun Lin, Jinhui |
author_facet | Chen, Rongjun Lin, Jinhui |
author_sort | Chen, Rongjun |
collection | PubMed |
description | OBJECTIVE: The present study aims to explore the role of smoking factors in the risk of lung cancer and to screen the feature risk pathways of smoking-induced lung cancer. METHODS: The expression profiles of the patient data from the GEO database were standardized, and differentially expressed genes (DEGs) were identified with the limma algorithm. Samples and genes were analyzed by unsupervised hierarchical clustering, and GO and KEGG enrichment analyses were performed on the DEGs. Protein-protein interaction (PPI) network data were downloaded from the BioGrid and HPRD databases, and the DEGs were mapped onto the PPI network to identify their interactions. The significantly enriched pathways were used to calculate an anomaly score, and the recursive feature elimination (RFE) method was used to optimize the feature sets. The model was trained using a support vector machine (SVM), and the predictions were plotted as ROC curves. The AUC value was calculated to evaluate the predictive performance of the SVM model. RESULTS: A total of 1923 DEGs were obtained, of which 826 were down-regulated and 1097 were up-regulated. Unsupervised hierarchical clustering analysis showed a diagnostic accuracy of 74% for lung cancer smokers and 75% for non-lung cancer smokers. Screening yielded five optimal feature pathway sets; when their clinical diagnostic ability was assessed with the SVM model, the accuracy improved to 84%. The diagnostic accuracy was 90% after combining clinical information. CONCLUSION: We verified that five signaling pathways combined with clinical information could be used as feature risk pathways for distinguishing lung cancer smokers from non-lung cancer smokers, and that they increased the diagnostic accuracy. |
format | Online Article Text |
id | pubmed-7272018 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7272018 2020-06-12 Identification of feature risk pathways of smoking-induced lung cancer based on SVM Chen, Rongjun Lin, Jinhui PLoS One Research Article Public Library of Science 2020-06-04 /pmc/articles/PMC7272018/ /pubmed/32497048 http://dx.doi.org/10.1371/journal.pone.0233445 Text en © 2020 Chen, Lin http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Chen, Rongjun Lin, Jinhui Identification of feature risk pathways of smoking-induced lung cancer based on SVM |
title | Identification of feature risk pathways of smoking-induced lung cancer based on SVM |
title_sort | identification of feature risk pathways of smoking-induced lung cancer based on svm |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7272018/ https://www.ncbi.nlm.nih.gov/pubmed/32497048 http://dx.doi.org/10.1371/journal.pone.0233445 |
work_keys_str_mv | AT chenrongjun identificationoffeatureriskpathwaysofsmokinginducedlungcancerbasedonsvm AT linjinhui identificationoffeatureriskpathwaysofsmokinginducedlungcancerbasedonsvm |
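
The description above reports that the SVM predictions were plotted as ROC curves and summarized by an AUC value. As a rough illustration of what that number measures, the following minimal sketch computes AUC as the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, counting ties as half. The function name `roc_auc`, the labels, and the scores are hypothetical and are not taken from the study; the authors' actual tooling is not stated in this record.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. lung cancer smoker), 0 otherwise.
    scores: classifier decision scores; higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six samples (illustration only, not study data).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why it complements the accuracy figures quoted in the description. The pairwise loop is quadratic in the number of samples; it is chosen here for readability rather than speed.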