Tumor type classification and candidate cancer-specific biomarkers discovery via semi-supervised learning
Identifying cancer-related differentially expressed genes provides significant information for diagnosing tumors, predicting prognoses, and effective treatments. Recently, deep learning methods have been used to perform gene differential expression analysis using microarray-based high-throughput gen...
Main authors: | Chen, Peng; Li, Zhenlei; Hong, Zhaolin; Zheng, Haoran; Zeng, Rong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Biophysics Reports Editorial Office, 2023 |
Subjects: | Method |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10518520/ https://www.ncbi.nlm.nih.gov/pubmed/37753058 http://dx.doi.org/10.52601/bpr.2023.230005 |
_version_ | 1785109532278521856 |
---|---|
author | Chen, Peng; Li, Zhenlei; Hong, Zhaolin; Zheng, Haoran; Zeng, Rong |
author_facet | Chen, Peng; Li, Zhenlei; Hong, Zhaolin; Zheng, Haoran; Zeng, Rong |
author_sort | Chen, Peng |
collection | PubMed |
description | Identifying cancer-related differentially expressed genes provides significant information for diagnosing tumors, predicting prognoses, and choosing effective treatments. Recently, deep learning methods have been used to perform gene differential expression analysis using microarray-based high-throughput gene profiling and have achieved good results. In this study, we propose a new robust multiple-datasets-based semi-supervised learning model, MSSL, to perform tumor type classification and candidate cancer-specific biomarker discovery across multiple tumor types and multiple datasets. MSSL addresses the following long-standing obstacles: (1) the data volume of any existing single dataset is not enough to fully exploit the advantages of deep learning; (2) a large number of datasets from different research institutions cannot be used effectively because of inconsistent internal variances and low quality; (3) deep learning methods have limited effectiveness on relatively uncommon cancers. We applied MSSL to The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) pan-cancer normalized level 3 RNA-seq data and obtained a final classification accuracy of 97.6%, a significant improvement over previous approaches. Finally, we ranked the importance of genes for each cancer type based on the classification results and validated that the top-ranked genes were biologically meaningful for the corresponding tumors and that some of them had already been used as biomarkers, which demonstrates the efficacy of our method. |
format | Online Article Text |
id | pubmed-10518520 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Biophysics Reports Editorial Office |
record_format | MEDLINE/PubMed |
spelling | pubmed-105185202023-09-26 Tumor type classification and candidate cancer-specific biomarkers discovery via semi-supervised learning Chen, Peng Li, Zhenlei Hong, Zhaolin Zheng, Haoran Zeng, Rong Biophys Rep Method Identifying cancer-related differentially expressed genes provides significant information for diagnosing tumors, predicting prognoses, and effective treatments. Recently, deep learning methods have been used to perform gene differential expression analysis using microarray-based high-throughput gene profiling and have achieved good results. In this study, we proposed a new robust multiple-datasets-based semi-supervised learning model, MSSL, to perform tumor type classification and candidate cancer-specific biomarkers discovery across multiple tumor types and multiple datasets, which addressed the following long-lasting obstacles: (1) the data volume of the existing single dataset is not enough to fully exert the advantages of deep learning; (2) a large number of datasets from different research institutions cannot be effectively used due to inconsistent internal variances and low quality; (3) relatively uncommon cancers have limited effects on deep learning methods. In our article, we applied MSSL to The Cancer Genome Atlas (TCGA) and the Gene Expression Comprehensive Database (GEO) pan-cancer normalized-level3 RNA-seq data and got 97.6% final classification accuracy, which had a significant performance leap compared with previous approaches. Finally, we got the ranking of the importance of the corresponding genes for each cancer type based on classification results and validated that the top genes selected in this way were biologically meaningful for corresponding tumors and some of them had been used as biomarkers, which showed the efficacy of our method. 
Biophysics Reports Editorial Office 2023-04-30 /pmc/articles/PMC10518520/ /pubmed/37753058 http://dx.doi.org/10.52601/bpr.2023.230005 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Method Chen, Peng Li, Zhenlei Hong, Zhaolin Zheng, Haoran Zeng, Rong Tumor type classification and candidate cancer-specific biomarkers discovery via semi-supervised learning |
title | Tumor type classification and candidate cancer-specific biomarkers discovery via semi-supervised learning |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10518520/ https://www.ncbi.nlm.nih.gov/pubmed/37753058 http://dx.doi.org/10.52601/bpr.2023.230005 |