CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training

BACKGROUND: State-of-the-art deep-learning-based cancer type prediction can only predict cancer types whose samples are available during training, where the sample size is commonly large. In this paper, we consider how to use existing training samples to predict cancer types unseen du...

Bibliographic Details
Main Authors: Mostavi, Milad; Chiu, Yu-Chiao; Chen, Yidong; Huang, Yufei
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8117642/
https://www.ncbi.nlm.nih.gov/pubmed/33980137
http://dx.doi.org/10.1186/s12859-021-04157-w
author Mostavi, Milad
Chiu, Yu-Chiao
Chen, Yidong
Huang, Yufei
collection PubMed
description BACKGROUND: State-of-the-art deep-learning-based cancer type prediction can only predict cancer types whose samples are available during training, where the sample size is commonly large. In this paper, we consider how to use existing training samples to predict cancer types unseen during training. We hypothesize the existence of a set of type-agnostic expression representations that define the similarity/dissimilarity between samples of the same/different types and propose a novel one-shot learning model called CancerSiamese to learn this common representation. CancerSiamese accepts a pair of query and support samples (gene expression profiles) and learns the representation of similar or dissimilar cancer types through two parallel convolutional neural networks joined by a similarity function. RESULTS: We trained CancerSiamese for cancer type prediction for primary and metastatic tumors using samples from The Cancer Genome Atlas (TCGA) and MET500. Network transfer learning was used to facilitate the training of the CancerSiamese models. CancerSiamese was tested for different N-way predictions and yielded an average accuracy improvement of 8% and 4% over the benchmark 1-Nearest Neighbor (1-NN) classifier for primary and metastatic tumors, respectively. Moreover, we applied the guided gradient saliency map and feature selection to CancerSiamese to examine 100 and 200 top marker-gene candidates for the prediction of primary and metastatic cancers, respectively. Functional analysis of these marker genes revealed several cancer-related functions between primary and metastatic tumors. CONCLUSION: This work demonstrated, for the first time, the feasibility of predicting unseen cancer types whose samples are limited. Thus, it could inspire new and ingenious applications of one-shot and few-shot learning solutions for improving cancer diagnosis, prognosis, and our understanding of cancer.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04157-w.
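The N-way one-shot protocol and the 1-NN benchmark named in the description can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the Euclidean distance, the function names, and the synthetic "expression profiles". In CancerSiamese the distance would be replaced by a similarity score learned by the two parallel convolutional networks; this sketch covers only the baseline side of the comparison.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_shot_predict(query, support, distance=euclidean):
    """1-NN one-shot prediction.

    support maps each candidate label to exactly one profile (one shot
    per class); the label of the closest support profile is returned.
    """
    return min(support, key=lambda label: distance(query, support[label]))


def n_way_accuracy(episodes, distance=euclidean):
    """Fraction of (query, support, true_label) episodes predicted correctly."""
    correct = sum(
        one_shot_predict(query, support, distance) == true_label
        for query, support, true_label in episodes
    )
    return correct / len(episodes)


def make_episode(rng, n_way=5, n_genes=20, noise=0.1):
    """Build one synthetic N-way episode (toy data, not real expression)."""
    centers = {
        f"type_{i}": [rng.gauss(0, 1) for _ in range(n_genes)]
        for i in range(n_way)
    }
    # One noisy support sample per class, plus one noisy query from a
    # randomly chosen class.
    support = {
        label: [c + rng.gauss(0, noise) for c in center]
        for label, center in centers.items()
    }
    true_label = rng.choice(sorted(centers))
    query = [c + rng.gauss(0, noise) for c in centers[true_label]]
    return query, support, true_label


if __name__ == "__main__":
    rng = random.Random(0)
    episodes = [make_episode(rng, n_way=5) for _ in range(200)]
    print("5-way one-shot 1-NN accuracy:", n_way_accuracy(episodes))
```

Passing a different `distance` callable to `n_way_accuracy` is how a learned similarity (negated, so that smaller means more similar) could be compared against the 1-NN baseline on the same episodes.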
format Online
Article
Text
id pubmed-8117642
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8117642 2021-05-17 BMC Bioinformatics Research Article BioMed Central 2021-05-12 /pmc/articles/PMC8117642/ /pubmed/33980137 http://dx.doi.org/10.1186/s12859-021-04157-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8117642/
https://www.ncbi.nlm.nih.gov/pubmed/33980137
http://dx.doi.org/10.1186/s12859-021-04157-w