CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training
BACKGROUND: The state-of-the-art deep learning based cancer type prediction can only predict cancer types whose samples are available during the training where the sample size is commonly large. In this paper, we consider how to utilize the existing training samples to predict cancer types unseen du...
Main Authors: | Mostavi, Milad; Chiu, Yu-Chiao; Chen, Yidong; Huang, Yufei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8117642/ https://www.ncbi.nlm.nih.gov/pubmed/33980137 http://dx.doi.org/10.1186/s12859-021-04157-w |
_version_ | 1783691623402569728 |
---|---|
author | Mostavi, Milad Chiu, Yu-Chiao Chen, Yidong Huang, Yufei |
author_facet | Mostavi, Milad Chiu, Yu-Chiao Chen, Yidong Huang, Yufei |
author_sort | Mostavi, Milad |
collection | PubMed |
description | BACKGROUND: The state-of-the-art deep learning based cancer type prediction can only predict cancer types whose samples are available during the training where the sample size is commonly large. In this paper, we consider how to utilize the existing training samples to predict cancer types unseen during the training. We hypothesize the existence of a set of type-agnostic expression representations that define the similarity/dissimilarity between samples of the same/different types and propose a novel one-shot learning model called CancerSiamese to learn this common representation. CancerSiamese accepts a pair of query and support samples (gene expression profiles) and learns the representation of similar or dissimilar cancer types through two parallel convolutional neural networks joined by a similarity function. RESULTS: We trained CancerSiamese for cancer type prediction for primary and metastatic tumors using samples from the Cancer Genome Atlas (TCGA) and MET500. Network transfer learning was utilized to facilitate the training of the CancerSiamese models. CancerSiamese was tested for different N-way predictions and yielded an average accuracy improvement of 8% and 4% over the benchmark 1-Nearest Neighbor (1-NN) classifier for primary and metastatic tumors, respectively. Moreover, we applied the guided gradient saliency map and feature selection to CancerSiamese to examine 100 and 200 top marker-gene candidates for the prediction of primary and metastatic cancers, respectively. Functional analysis of these marker genes revealed several cancer related functions between primary and metastatic tumors. CONCLUSION: This work demonstrated, for the first time, the feasibility of predicting unseen cancer types whose samples are limited. Thus, it could inspire new and ingenious applications of one-shot and few-shot learning solutions for improving cancer diagnosis, prognostic, and our understanding of cancer. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04157-w. |
format | Online Article Text |
id | pubmed-8117642 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-81176422021-05-17 CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training Mostavi, Milad Chiu, Yu-Chiao Chen, Yidong Huang, Yufei BMC Bioinformatics Research Article BACKGROUND: The state-of-the-art deep learning based cancer type prediction can only predict cancer types whose samples are available during the training where the sample size is commonly large. In this paper, we consider how to utilize the existing training samples to predict cancer types unseen during the training. We hypothesize the existence of a set of type-agnostic expression representations that define the similarity/dissimilarity between samples of the same/different types and propose a novel one-shot learning model called CancerSiamese to learn this common representation. CancerSiamese accepts a pair of query and support samples (gene expression profiles) and learns the representation of similar or dissimilar cancer types through two parallel convolutional neural networks joined by a similarity function. RESULTS: We trained CancerSiamese for cancer type prediction for primary and metastatic tumors using samples from the Cancer Genome Atlas (TCGA) and MET500. Network transfer learning was utilized to facilitate the training of the CancerSiamese models. CancerSiamese was tested for different N-way predictions and yielded an average accuracy improvement of 8% and 4% over the benchmark 1-Nearest Neighbor (1-NN) classifier for primary and metastatic tumors, respectively. Moreover, we applied the guided gradient saliency map and feature selection to CancerSiamese to examine 100 and 200 top marker-gene candidates for the prediction of primary and metastatic cancers, respectively. Functional analysis of these marker genes revealed several cancer related functions between primary and metastatic tumors. 
CONCLUSION: This work demonstrated, for the first time, the feasibility of predicting unseen cancer types whose samples are limited. Thus, it could inspire new and ingenious applications of one-shot and few-shot learning solutions for improving cancer diagnosis, prognostic, and our understanding of cancer. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04157-w. BioMed Central 2021-05-12 /pmc/articles/PMC8117642/ /pubmed/33980137 http://dx.doi.org/10.1186/s12859-021-04157-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Mostavi, Milad Chiu, Yu-Chiao Chen, Yidong Huang, Yufei CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training |
title | CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training |
title_full | CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training |
title_fullStr | CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training |
title_full_unstemmed | CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training |
title_short | CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training |
title_sort | cancersiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8117642/ https://www.ncbi.nlm.nih.gov/pubmed/33980137 http://dx.doi.org/10.1186/s12859-021-04157-w |
work_keys_str_mv | AT mostavimilad cancersiameseoneshotlearningforpredictingprimaryandmetastatictumortypesunseenduringmodeltraining AT chiuyuchiao cancersiameseoneshotlearningforpredictingprimaryandmetastatictumortypesunseenduringmodeltraining AT chenyidong cancersiameseoneshotlearningforpredictingprimaryandmetastatictumortypesunseenduringmodeltraining AT huangyufei cancersiameseoneshotlearningforpredictingprimaryandmetastatictumortypesunseenduringmodeltraining |
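
The abstract in this record compares CancerSiamese against a 1-Nearest Neighbor (1-NN) benchmark on N-way one-shot prediction. The sketch below illustrates what such a benchmark might look like; it is not the authors' code or their exact evaluation protocol. It assumes each trial draws N cancer types, one support sample per type, and one query sample belonging to the first drawn type. The Euclidean distance, the function names, and the toy two-gene data with labels `TOY_A`, `TOY_B`, `TOY_C` are all hypothetical. A Siamese model would, in effect, replace the fixed `distance` with a learned similarity between the two branch embeddings.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_shot_predict(query, support, distance=euclidean):
    """Return the label whose single support sample is closest to the query.

    `support` maps each candidate label to exactly one sample (one shot).
    """
    return min(support, key=lambda label: distance(query, support[label]))


def n_way_accuracy(samples_by_type, n_way, trials, rng, distance=euclidean):
    """Estimate N-way one-shot accuracy over randomly drawn trials."""
    correct = 0
    for _ in range(trials):
        # Draw the N candidate types; the first one is the query's true type.
        types = rng.sample(sorted(samples_by_type), n_way)
        target = types[0]
        # Two distinct samples of the target type: one query, one support.
        query, target_support = rng.sample(samples_by_type[target], 2)
        support = {target: target_support}
        for other in types[1:]:
            support[other] = rng.choice(samples_by_type[other])
        if one_shot_predict(query, support, distance) == target:
            correct += 1
    return correct / trials


# Hypothetical, well-separated toy data (two "genes" per sample), not real TCGA
# or MET500 expression profiles.
toy = {
    "TOY_A": [[0.0, 0.0], [0.1, 0.0]],
    "TOY_B": [[5.0, 5.0], [5.0, 5.1]],
    "TOY_C": [[10.0, 0.0], [10.0, 0.2]],
}

accuracy = n_way_accuracy(toy, n_way=3, trials=20, rng=random.Random(0))
print(f"3-way one-shot accuracy on toy data: {accuracy:.2f}")
```

Passing `distance` as a parameter keeps the evaluation loop unchanged when a different similarity measure is swapped in, which is the only design point this sketch is meant to show.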