
Sequential analysis of transcript expression patterns improves survival prediction in multiple cancers

BACKGROUND: Long-term survival in numerous cancers often correlates with specific whole transcriptome profiles or the expression patterns of smaller numbers of transcripts. In some instances, these are better predictors of survival than are standard classification methods such as clinical stage or h...


Bibliographic Details
Main Authors: Mandel, Jordan, Avula, Raghunandan, Prochownik, Edward V.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7140376/
https://www.ncbi.nlm.nih.gov/pubmed/32264880
http://dx.doi.org/10.1186/s12885-020-06756-x
_version_ 1783518977429864448
author Mandel, Jordan
Avula, Raghunandan
Prochownik, Edward V.
author_facet Mandel, Jordan
Avula, Raghunandan
Prochownik, Edward V.
author_sort Mandel, Jordan
collection PubMed
description BACKGROUND: Long-term survival in numerous cancers often correlates with specific whole transcriptome profiles or the expression patterns of smaller numbers of transcripts. In some instances, these are better predictors of survival than are standard classification methods such as clinical stage or hormone receptor status in breast cancer. Here, we used t-distributed stochastic neighbor embedding (t-SNE) to show that, collectively, the expression patterns of small numbers of functionally related transcripts from fifteen cancer pathways correlate with long-term survival in the vast majority of tumor types from The Cancer Genome Atlas (TCGA). We then ask whether the sequential application of t-SNE using the transcripts from a second pathway improves predictive capability or whether t-SNE can be used to refine the initial predictive power of whole transcriptome profiling.
METHODS: RNAseq data from 10,227 tumors in TCGA were previously analyzed using t-SNE-based clustering of 362 transcripts comprising 15 distinct cancer-related pathways. After showing that certain clusters were associated with differential survival, each relevant cluster was re-analyzed by t-SNE with a second pathway’s transcripts. Alternatively, groups with differential survival identified by whole transcriptome profiling were subjected to a second, t-SNE-based analysis.
RESULTS: Sequential analyses employing either t-SNE➔t-SNE or whole transcriptome profiling➔t-SNE were in many cases superior to either individual method at predicting long-term survival. We developed a dynamic and intuitive R Shiny web application to explore the t-SNE-based transcriptome clustering and survival analysis across all TCGA cancers and all 15 cancer-related pathways in this analysis. This application provides a simple interface to select specific t-SNE clusters and analyze survival predictability using either individual or sequential approaches. The user can recreate the relationships described in this analysis and further explore many different cancer, pathway, and cluster combinations. Non-R users can access the application on the web at https://chpupsom19.shinyapps.io/Survival_Analysis_tsne_umap_TCGA. The application, R scripts performing survival analysis, and t-SNE clustering results of TCGA expression data are available on GitHub, enabling users to download and run the application locally with ease (https://github.com/RavulaPitt/Sequential-t-SNE/).
CONCLUSIONS: The long-term survival of patients correlated with expression patterns of 362 transcripts from 15 cancer-related pathways. In numerous cases, however, survival prediction could be further improved when the cohorts were re-analyzed using iterative t-SNE clustering or when t-SNE clustering was applied to cohorts initially segregated by whole transcriptome-based hierarchical clustering.
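The sequential idea in the abstract (pick a cluster from a first pathway, then compare survival between the sub-clusters a second pathway yields within it) can be illustrated with a minimal sketch. This is not the authors' code, which is written in R and available in their GitHub repository. It assumes cluster labels have already been computed elsewhere (for example by t-SNE followed by clustering), and it uses a plain two-group log-rank test as the survival comparison, which is an assumption here rather than something the record states. The field names `time`, `event`, `first_cluster`, and `second_cluster`, and the toy cohort, are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def logrank_test(
    times_a: Sequence[float], events_a: Sequence[int],
    times_b: Sequence[float], events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 d.o.f."""
    # Pool both groups as (time, event_observed, group) triples.
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_minus_expected = 0.0
    variance = 0.0
    # Visit each distinct time at which at least one event (death) was observed.
    for t in sorted({time for time, event, _ in pooled if event}):
        at_risk = sum(1 for time, _, _ in pooled if time >= t)
        at_risk_a = sum(1 for time, _, g in pooled if time >= t and g == 0)
        deaths = sum(1 for time, event, _ in pooled if time == t and event)
        deaths_a = sum(1 for time, event, g in pooled
                       if time == t and event and g == 0)
        frac_a = at_risk_a / at_risk
        # Observed minus expected deaths in group A under the null hypothesis.
        observed_minus_expected += deaths_a - deaths * frac_a
        if at_risk > 1:
            # Hypergeometric variance of the death count in group A.
            variance += (deaths * frac_a * (1 - frac_a)
                         * (at_risk - deaths) / (at_risk - 1))

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def sequential_logrank(
    patients: List[Dict], first_cluster: int, second_a: str, second_b: str
) -> Tuple[float, float]:
    """Restrict to one first-stage cluster, then compare two second-stage clusters."""
    subset = [p for p in patients if p["first_cluster"] == first_cluster]
    group_a = [p for p in subset if p["second_cluster"] == second_a]
    group_b = [p for p in subset if p["second_cluster"] == second_b]
    return logrank_test(
        [p["time"] for p in group_a], [p["event"] for p in group_a],
        [p["time"] for p in group_b], [p["event"] for p in group_b],
    )


# Hypothetical toy cohort: labels are invented for illustration only.
toy_cohort = (
    [{"time": t, "event": 1, "first_cluster": 1, "second_cluster": "A"}
     for t in (1, 2, 3, 4, 5)]
    + [{"time": t, "event": 1, "first_cluster": 1, "second_cluster": "B"}
       for t in (10, 11, 12, 13, 14)]
    + [{"time": t, "event": 0, "first_cluster": 2, "second_cluster": "A"}
       for t in (6, 7, 8)]
)

chi2, p_value = sequential_logrank(toy_cohort, first_cluster=1,
                                   second_a="A", second_b="B")
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

In a real analysis the labels would come from the published clustering results, and a maintained survival library would be preferable to this hand-rolled test.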
format Online
Article
Text
id pubmed-7140376
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7140376 2020-04-14 Sequential analysis of transcript expression patterns improves survival prediction in multiple cancers Mandel, Jordan Avula, Raghunandan Prochownik, Edward V. BMC Cancer Research Article BioMed Central 2020-04-07 /pmc/articles/PMC7140376/ /pubmed/32264880 http://dx.doi.org/10.1186/s12885-020-06756-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Sequential analysis of transcript expression patterns improves survival prediction in multiple cancers
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7140376/
https://www.ncbi.nlm.nih.gov/pubmed/32264880
http://dx.doi.org/10.1186/s12885-020-06756-x