Ensemble Modeling Approach Targeting Heterogeneous RNA-Seq data: Application to Melanoma Pseudogenes
We studied the transcriptome landscape of skin cutaneous melanoma (SKCM) using 103 primary tumor samples from TCGA, and measured the expression levels of both protein coding genes and non-coding RNAs (ncRNAs). In particular, we emphasized pseudogenes potentially relevant to this cancer. While catalo...
Main Authors: | Capobianco, Enrico; Valdes, Camilo; Sarti, Samanta; Jiang, Zhijie; Poliseno, Laura; Tsinoremas, Nicolas F. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5725464/ https://www.ncbi.nlm.nih.gov/pubmed/29229974 http://dx.doi.org/10.1038/s41598-017-17337-7 |
_version_ | 1783285528753340416 |
---|---|
author | Capobianco, Enrico Valdes, Camilo Sarti, Samanta Jiang, Zhijie Poliseno, Laura Tsinoremas, Nicolas F. |
author_facet | Capobianco, Enrico Valdes, Camilo Sarti, Samanta Jiang, Zhijie Poliseno, Laura Tsinoremas, Nicolas F. |
author_sort | Capobianco, Enrico |
collection | PubMed |
description | We studied the transcriptome landscape of skin cutaneous melanoma (SKCM) using 103 primary tumor samples from TCGA, and measured the expression levels of both protein coding genes and non-coding RNAs (ncRNAs). In particular, we emphasized pseudogenes potentially relevant to this cancer. While cataloguing the profiles based on the known biotypes, all the employed RNA-Seq methods generated just a small consensus of significant biotypes. We thus designed an approach to reconcile the profiles from all methods following a simple strategy: we selected genes that were confirmed as differentially expressed by the ensemble predictions obtained in a regression model. The main advantages of this approach are: 1) Selection of a high-confidence gene set identifying relevant pathways; 2) Use of a regression model whose covariates embed all method-driven outcomes to predict an averaged profile; 3) Method-specific assessment of prediction power and significance. Furthermore, the approach can be generalized to any biological system for which noisy RNA-Seq profiles are computed. As our analyses concerned bio-annotations of both high-quality protein coding genes and ncRNAs, we considered the associations between pseudogenes and parental genes (targets). Among the candidate targets that were validated, we identified PINK1, which is studied in patients with Parkinson and cancer (especially melanoma). |
format | Online Article Text |
id | pubmed-5725464 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-57254642017-12-13 Ensemble Modeling Approach Targeting Heterogeneous RNA-Seq data: Application to Melanoma Pseudogenes Capobianco, Enrico Valdes, Camilo Sarti, Samanta Jiang, Zhijie Poliseno, Laura Tsinoremas, Nicolas F. Sci Rep Article We studied the transcriptome landscape of skin cutaneous melanoma (SKCM) using 103 primary tumor samples from TCGA, and measured the expression levels of both protein coding genes and non-coding RNAs (ncRNAs). In particular, we emphasized pseudogenes potentially relevant to this cancer. While cataloguing the profiles based on the known biotypes, all the employed RNA-Seq methods generated just a small consensus of significant biotypes. We thus designed an approach to reconcile the profiles from all methods following a simple strategy: we selected genes that were confirmed as differentially expressed by the ensemble predictions obtained in a regression model. The main advantages of this approach are: 1) Selection of a high-confidence gene set identifying relevant pathways; 2) Use of a regression model whose covariates embed all method-driven outcomes to predict an averaged profile; 3) Method-specific assessment of prediction power and significance. Furthermore, the approach can be generalized to any biological system for which noisy RNA-Seq profiles are computed. As our analyses concerned bio-annotations of both high-quality protein coding genes and ncRNAs, we considered the associations between pseudogenes and parental genes (targets). Among the candidate targets that were validated, we identified PINK1, which is studied in patients with Parkinson and cancer (especially melanoma). Nature Publishing Group UK 2017-12-11 /pmc/articles/PMC5725464/ /pubmed/29229974 http://dx.doi.org/10.1038/s41598-017-17337-7 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Capobianco, Enrico Valdes, Camilo Sarti, Samanta Jiang, Zhijie Poliseno, Laura Tsinoremas, Nicolas F. Ensemble Modeling Approach Targeting Heterogeneous RNA-Seq data: Application to Melanoma Pseudogenes |
title | Ensemble Modeling Approach Targeting Heterogeneous RNA-Seq data: Application to Melanoma Pseudogenes |
title_full | Ensemble Modeling Approach Targeting Heterogeneous RNA-Seq data: Application to Melanoma Pseudogenes |
title_fullStr | Ensemble Modeling Approach Targeting Heterogeneous RNA-Seq data: Application to Melanoma Pseudogenes |
title_full_unstemmed | Ensemble Modeling Approach Targeting Heterogeneous RNA-Seq data: Application to Melanoma Pseudogenes |
title_short | Ensemble Modeling Approach Targeting Heterogeneous RNA-Seq data: Application to Melanoma Pseudogenes |
title_sort | ensemble modeling approach targeting heterogeneous rna-seq data: application to melanoma pseudogenes |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5725464/ https://www.ncbi.nlm.nih.gov/pubmed/29229974 http://dx.doi.org/10.1038/s41598-017-17337-7 |
work_keys_str_mv | AT capobiancoenrico ensemblemodelingapproachtargetingheterogeneousrnaseqdataapplicationtomelanomapseudogenes AT valdescamilo ensemblemodelingapproachtargetingheterogeneousrnaseqdataapplicationtomelanomapseudogenes AT sartisamanta ensemblemodelingapproachtargetingheterogeneousrnaseqdataapplicationtomelanomapseudogenes AT jiangzhijie ensemblemodelingapproachtargetingheterogeneousrnaseqdataapplicationtomelanomapseudogenes AT polisenolaura ensemblemodelingapproachtargetingheterogeneousrnaseqdataapplicationtomelanomapseudogenes AT tsinoremasnicolasf ensemblemodelingapproachtargetingheterogeneousrnaseqdataapplicationtomelanomapseudogenes |