Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma
BACKGROUND: One approach to improving the personalized treatment of cancer is to understand the cellular signaling transduction pathways that cause cancer at the level of the individual patient. In this study, we used unsupervised deep learning to learn the hierarchical structure within cancer gene...
Main Authors: | Young, Jonathan D., Cai, Chunhui, Lu, Xinghua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629551/ https://www.ncbi.nlm.nih.gov/pubmed/28984190 http://dx.doi.org/10.1186/s12859-017-1798-2 |
_version_ | 1783269065205219328 |
---|---|
author | Young, Jonathan D. Cai, Chunhui Lu, Xinghua |
author_facet | Young, Jonathan D. Cai, Chunhui Lu, Xinghua |
author_sort | Young, Jonathan D. |
collection | PubMed |
description | BACKGROUND: One approach to improving the personalized treatment of cancer is to understand the cellular signaling transduction pathways that cause cancer at the level of the individual patient. In this study, we used unsupervised deep learning to learn the hierarchical structure within cancer gene expression data. Deep learning is a group of machine learning algorithms that use multiple layers of hidden units to capture hierarchically related, alternative representations of the input data. We hypothesize that this hierarchical structure learned by deep learning will be related to the cellular signaling system. RESULTS: Robust deep learning model selection identified a network architecture that is biologically plausible. Our model selection results indicated that the 1st hidden layer of our deep learning model should contain about 1300 hidden units to most effectively capture the covariance structure of the input data. This agrees with the estimated number of human transcription factors, which is approximately 1400. This result lends support to our hypothesis that the 1st hidden layer of a deep learning model trained on gene expression data may represent signals related to transcription factor activation. Using the 3rd hidden layer representation of each tumor as learned by our unsupervised deep learning model, we performed consensus clustering on all tumor samples, leading to the discovery of clusters of glioblastoma multiforme with differential survival. One of these clusters contained all of the glioblastoma samples with G-CIMP, a known methylation phenotype driven by the IDH1 mutation and associated with favorable prognosis, suggesting that the hidden units in the 3rd hidden layer representations captured a methylation signal without explicitly using methylation data as input. We also found differentially expressed genes and well-known mutations (NF1, IDH1, EGFR) that were uniquely correlated with each of these clusters. Exploring these unique genes and mutations will allow us to further investigate the disease mechanisms underlying each of these clusters. CONCLUSIONS: In summary, we show that a deep learning model can be trained to represent biologically and clinically meaningful abstractions of cancer gene expression data. Understanding what additional relationships these hidden layer abstractions have with the cancer cellular signaling system could have a significant impact on the understanding and treatment of cancer. |
format | Online Article Text |
id | pubmed-5629551 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5629551 2017-10-13 Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma Young, Jonathan D. Cai, Chunhui Lu, Xinghua BMC Bioinformatics Research BioMed Central 2017-10-03 /pmc/articles/PMC5629551/ /pubmed/28984190 http://dx.doi.org/10.1186/s12859-017-1798-2 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Young, Jonathan D. Cai, Chunhui Lu, Xinghua Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma |
title | Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma |
title_sort | unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629551/ https://www.ncbi.nlm.nih.gov/pubmed/28984190 http://dx.doi.org/10.1186/s12859-017-1798-2 |
work_keys_str_mv | AT youngjonathand unsuperviseddeeplearningrevealsprognosticallyrelevantsubtypesofglioblastoma AT caichunhui unsuperviseddeeplearningrevealsprognosticallyrelevantsubtypesofglioblastoma AT luxinghua unsuperviseddeeplearningrevealsprognosticallyrelevantsubtypesofglioblastoma |
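
The abstract in this record describes consensus clustering applied to the 3rd hidden layer representation of each tumor. The record does not state the base clustering algorithm, the number of resampling runs, or the subsampling fraction, so the following is only a minimal sketch of the general idea under assumed settings. The function names `kmeans` and `consensus_matrix`, the matrix `H`, and the synthetic two-group data are all hypothetical stand-ins, not the authors' code or data.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Lloyd's k-means with farthest-first initialization (assumed base clusterer)."""
    centers = [X[rng.integers(len(X))]]
    while len(centers) < k:
        # Pick the sample farthest from all centers chosen so far.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Squared distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def consensus_matrix(X, k, n_runs=50, frac=0.8, seed=0):
    """Fraction of subsampled runs in which each pair of samples co-clusters."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # times a pair landed in the same cluster
    sampled = np.zeros((n, n))   # times a pair was drawn in the same subsample
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        labels = kmeans(X[idx], k, rng)
        same = (labels[:, None] == labels[None, :]).astype(float)
        sampled[np.ix_(idx, idx)] += 1.0
        together[np.ix_(idx, idx)] += same
    return together / np.maximum(sampled, 1.0)

# Hypothetical stand-in for hidden-layer representations: 40 samples, 5 units,
# drawn from two well-separated groups (synthetic, not tumor data).
data_rng = np.random.default_rng(1)
H = np.vstack([data_rng.normal(0.0, 1.0, (20, 5)),
               data_rng.normal(10.0, 1.0, (20, 5))])

C = consensus_matrix(H, k=2)
# Simplification: final labels come from k-means on the rows of C; consensus
# clustering implementations commonly use hierarchical clustering here instead.
final = kmeans(C, 2, np.random.default_rng(2))
```

Entries of `C` near 1 mark pairs of samples that are grouped together in almost every resampled run, and entries near 0 mark pairs that are almost never grouped together. In practice the number of clusters `k` would itself be chosen by comparing consensus matrices across several candidate values.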