Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles
Heterogeneity is a fundamental feature of complex phenotypes. So far, genomic screenings have profiled thousands of samples providing insights into the transcriptome of the cell. However, disentangling the heterogeneity of these transcriptomic Big Data to identify defective biological processes rema...
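Main Authors: | Lauria, Andrea; Peirone, Serena; Giudice, Marco Del; Priante, Francesca; Rajan, Prabhakar; Caselle, Michele; Oliviero, Salvatore; Cereda, Matteo |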
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Computational Biology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7038995/ https://www.ncbi.nlm.nih.gov/pubmed/31889184 http://dx.doi.org/10.1093/nar/gkz1208 |
_version_ | 1783500746110533632 |
---|---|
author | Lauria, Andrea; Peirone, Serena; Giudice, Marco Del; Priante, Francesca; Rajan, Prabhakar; Caselle, Michele; Oliviero, Salvatore; Cereda, Matteo |
author_facet | Lauria, Andrea; Peirone, Serena; Giudice, Marco Del; Priante, Francesca; Rajan, Prabhakar; Caselle, Michele; Oliviero, Salvatore; Cereda, Matteo |
author_sort | Lauria, Andrea |
collection | PubMed |
description | Heterogeneity is a fundamental feature of complex phenotypes. So far, genomic screenings have profiled thousands of samples providing insights into the transcriptome of the cell. However, disentangling the heterogeneity of these transcriptomic Big Data to identify defective biological processes remains challenging. Here we present GSECA, a method exploiting the bimodal behavior of RNA-sequencing gene expression profiles to identify altered gene sets in heterogeneous patient cohorts. Using simulated and experimental RNA-sequencing data sets, we show that GSECA provides higher performances than other available algorithms in detecting truly altered biological processes in large cohorts. Applied to 5941 samples from 14 different cancer types, GSECA correctly identified the alteration of the PI3K/AKT signaling pathway driven by the somatic loss of PTEN and verified the emerging role of PTEN in modulating immune-related processes. In particular, we showed that, in prostate cancer, PTEN loss appears to establish an immunosuppressive tumor microenvironment through the activation of STAT3, and low PTEN expression levels have a detrimental impact on patient disease-free survival. GSECA is available at https://github.com/matteocereda/GSECA. |
format | Online Article Text |
id | pubmed-7038995 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-70389952020-03-02 Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles Lauria, Andrea Peirone, Serena Giudice, Marco Del Priante, Francesca Rajan, Prabhakar Caselle, Michele Oliviero, Salvatore Cereda, Matteo Nucleic Acids Res Computational Biology Heterogeneity is a fundamental feature of complex phenotypes. So far, genomic screenings have profiled thousands of samples providing insights into the transcriptome of the cell. However, disentangling the heterogeneity of these transcriptomic Big Data to identify defective biological processes remains challenging. Here we present GSECA, a method exploiting the bimodal behavior of RNA-sequencing gene expression profiles to identify altered gene sets in heterogeneous patient cohorts. Using simulated and experimental RNA-sequencing data sets, we show that GSECA provides higher performances than other available algorithms in detecting truly altered biological processes in large cohorts. Applied to 5941 samples from 14 different cancer types, GSECA correctly identified the alteration of the PI3K/AKT signaling pathway driven by the somatic loss of PTEN and verified the emerging role of PTEN in modulating immune-related processes. In particular, we showed that, in prostate cancer, PTEN loss appears to establish an immunosuppressive tumor microenvironment through the activation of STAT3, and low PTEN expression levels have a detrimental impact on patient disease-free survival. GSECA is available at https://github.com/matteocereda/GSECA. Oxford University Press 2020-02-28 2019-12-31 /pmc/articles/PMC7038995/ /pubmed/31889184 http://dx.doi.org/10.1093/nar/gkz1208 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Computational Biology Lauria, Andrea Peirone, Serena Giudice, Marco Del Priante, Francesca Rajan, Prabhakar Caselle, Michele Oliviero, Salvatore Cereda, Matteo Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles |
title | Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles |
title_full | Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles |
title_fullStr | Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles |
title_full_unstemmed | Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles |
title_short | Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles |
title_sort | identification of altered biological processes in heterogeneous rna-sequencing data by discretization of expression profiles |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7038995/ https://www.ncbi.nlm.nih.gov/pubmed/31889184 http://dx.doi.org/10.1093/nar/gkz1208 |
work_keys_str_mv | AT lauriaandrea identificationofalteredbiologicalprocessesinheterogeneousrnasequencingdatabydiscretizationofexpressionprofiles AT peironeserena identificationofalteredbiologicalprocessesinheterogeneousrnasequencingdatabydiscretizationofexpressionprofiles AT giudicemarcodel identificationofalteredbiologicalprocessesinheterogeneousrnasequencingdatabydiscretizationofexpressionprofiles AT priantefrancesca identificationofalteredbiologicalprocessesinheterogeneousrnasequencingdatabydiscretizationofexpressionprofiles AT rajanprabhakar identificationofalteredbiologicalprocessesinheterogeneousrnasequencingdatabydiscretizationofexpressionprofiles AT casellemichele identificationofalteredbiologicalprocessesinheterogeneousrnasequencingdatabydiscretizationofexpressionprofiles AT olivierosalvatore identificationofalteredbiologicalprocessesinheterogeneousrnasequencingdatabydiscretizationofexpressionprofiles AT ceredamatteo identificationofalteredbiologicalprocessesinheterogeneousrnasequencingdatabydiscretizationofexpressionprofiles |