Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma

BACKGROUND: One approach to improving the personalized treatment of cancer is to understand the cellular signaling transduction pathways that cause cancer at the level of the individual patient. In this study, we used unsupervised deep learning to learn the hierarchical structure within cancer gene...

Bibliographic Details
Main authors:	Young, Jonathan D., Cai, Chunhui, Lu, Xinghua
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629551/ https://www.ncbi.nlm.nih.gov/pubmed/28984190 http://dx.doi.org/10.1186/s12859-017-1798-2

_version_	1783269065205219328
author	Young, Jonathan D. Cai, Chunhui Lu, Xinghua
author_sort	Young, Jonathan D.
collection	PubMed
description	BACKGROUND: One approach to improving the personalized treatment of cancer is to understand the cellular signaling transduction pathways that cause cancer at the level of the individual patient. In this study, we used unsupervised deep learning to learn the hierarchical structure within cancer gene expression data. Deep learning is a group of machine learning algorithms that use multiple layers of hidden units to capture hierarchically related, alternative representations of the input data. We hypothesize that this hierarchical structure learned by deep learning will be related to the cellular signaling system. RESULTS: Robust deep learning model selection identified a network architecture that is biologically plausible. Our model selection results indicated that the 1st hidden layer of our deep learning model should contain about 1300 hidden units to most effectively capture the covariance structure of the input data. This agrees with the estimated number of human transcription factors, which is approximately 1400. This result lends support to our hypothesis that the 1st hidden layer of a deep learning model trained on gene expression data may represent signals related to transcription factor activation. Using the 3rd hidden layer representation of each tumor as learned by our unsupervised deep learning model, we performed consensus clustering on all tumor samples—leading to the discovery of clusters of glioblastoma multiforme with differential survival. One of these clusters contained all of the glioblastoma samples with G-CIMP, a known methylation phenotype driven by the IDH1 mutation and associated with favorable prognosis, suggesting that the hidden units in the 3rd hidden layer representations captured a methylation signal without explicitly using methylation data as input. We also found differentially expressed genes and well-known mutations (NF1, IDH1, EGFR) that were uniquely correlated with each of these clusters. 
Exploring these unique genes and mutations will allow us to further investigate the disease mechanisms underlying each of these clusters. CONCLUSIONS: In summary, we show that a deep learning model can be trained to represent biologically and clinically meaningful abstractions of cancer gene expression data. Understanding what additional relationships these hidden layer abstractions have with the cancer cellular signaling system could have a significant impact on the understanding and treatment of cancer.
format	Online Article Text
id	pubmed-5629551
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5629551 2017-10-13 Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma Young, Jonathan D. Cai, Chunhui Lu, Xinghua BMC Bioinformatics Research BioMed Central 2017-10-03 /pmc/articles/PMC5629551/ /pubmed/28984190 http://dx.doi.org/10.1186/s12859-017-1798-2 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Young, Jonathan D. Cai, Chunhui Lu, Xinghua Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma
title	Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma
title_sort	unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629551/ https://www.ncbi.nlm.nih.gov/pubmed/28984190 http://dx.doi.org/10.1186/s12859-017-1798-2
work_keys_str_mv	AT youngjonathand unsuperviseddeeplearningrevealsprognosticallyrelevantsubtypesofglioblastoma AT caichunhui unsuperviseddeeplearningrevealsprognosticallyrelevantsubtypesofglioblastoma AT luxinghua unsuperviseddeeplearningrevealsprognosticallyrelevantsubtypesofglioblastoma
