Learning a Latent Space of Highly Multidimensional Cancer Data

We introduce a Unified Disentanglement Network (UFDN) trained on The Cancer Genome Atlas (TCGA), which we refer to as UFDN-TCGA. We demonstrate that UFDN-TCGA learns a biologically relevant, low-dimensional latent space of high-dimensional gene expression data by applying our network to two classifi...

Bibliographic Details
Main Authors: Kompa, Benjamin; Coker, Beau
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6934353/
https://www.ncbi.nlm.nih.gov/pubmed/31797612
_version_ 1783483374037368832
author Kompa, Benjamin
Coker, Beau
author_facet Kompa, Benjamin
Coker, Beau
author_sort Kompa, Benjamin
collection PubMed
description We introduce a Unified Disentanglement Network (UFDN) trained on The Cancer Genome Atlas (TCGA), which we refer to as UFDN-TCGA. We demonstrate that UFDN-TCGA learns a biologically relevant, low-dimensional latent space of high-dimensional gene expression data by applying our network to two classification tasks of cancer status and cancer type. UFDN-TCGA performs comparably to random forest methods. The UFDN allows for continuous, partial interpolation between distinct cancer types. Furthermore, we perform an analysis of differentially expressed genes between skin cutaneous melanoma (SKCM) samples and the same samples interpolated into glioblastoma (GBM). We demonstrate that our interpolations consist of relevant metagenes that recapitulate known glioblastoma mechanisms.
format Online
Article
Text
id pubmed-6934353
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-69343532020-01-01 Learning a Latent Space of Highly Multidimensional Cancer Data Kompa, Benjamin Coker, Beau Pac Symp Biocomput Article We introduce a Unified Disentanglement Network (UFDN) trained on The Cancer Genome Atlas (TCGA), which we refer to as UFDN-TCGA. We demonstrate that UFDN-TCGA learns a biologically relevant, low-dimensional latent space of high-dimensional gene expression data by applying our network to two classification tasks of cancer status and cancer type. UFDN-TCGA performs comparably to random forest methods. The UFDN allows for continuous, partial interpolation between distinct cancer types. Furthermore, we perform an analysis of differentially expressed genes between skin cutaneous melanoma (SKCM) samples and the same samples interpolated into glioblastoma (GBM). We demonstrate that our interpolations consist of relevant metagenes that recapitulate known glioblastoma mechanisms. 2020 /pmc/articles/PMC6934353/ /pubmed/31797612 Text en http://creativecommons.org/licenses/by-nc/4.0/ Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution Non-Commercial (CC BY-NC) 4.0 License.
spellingShingle Article
Kompa, Benjamin
Coker, Beau
Learning a Latent Space of Highly Multidimensional Cancer Data
title Learning a Latent Space of Highly Multidimensional Cancer Data
title_full Learning a Latent Space of Highly Multidimensional Cancer Data
title_fullStr Learning a Latent Space of Highly Multidimensional Cancer Data
title_full_unstemmed Learning a Latent Space of Highly Multidimensional Cancer Data
title_short Learning a Latent Space of Highly Multidimensional Cancer Data
title_sort learning a latent space of highly multidimensional cancer data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6934353/
https://www.ncbi.nlm.nih.gov/pubmed/31797612
work_keys_str_mv AT kompabenjamin learningalatentspaceofhighlymultidimensionalcancerdata
AT cokerbeau learningalatentspaceofhighlymultidimensionalcancerdata