The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma
The identity of cancer cells is defined by the interplay between genetic, epigenetic, transcriptional and post-transcriptional variation. A lot of this variation is present in RNA-seq data and can be captured at once using reference-free, k-mer analysis. An important issue with k-mer analysis, howeve...
Main Authors: | Wang, Yunfeng; Xue, Haoliang; Aglave, Marine; Lainé, Antoine; Gallopin, Mélina; Gautheret, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Cancer Computational Biology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8807116/ https://www.ncbi.nlm.nih.gov/pubmed/35118386 http://dx.doi.org/10.1093/narcan/zcac001 |
_version_ | 1784643613547823104 |
---|---|
author | Wang, Yunfeng Xue, Haoliang Aglave, Marine Lainé, Antoine Gallopin, Mélina Gautheret, Daniel |
author_facet | Wang, Yunfeng Xue, Haoliang Aglave, Marine Lainé, Antoine Gallopin, Mélina Gautheret, Daniel |
author_sort | Wang, Yunfeng |
collection | PubMed |
description | The identity of cancer cells is defined by the interplay between genetic, epigenetic, transcriptional and post-transcriptional variation. A lot of this variation is present in RNA-seq data and can be captured at once using reference-free, k-mer analysis. An important issue with k-mer analysis, however, is the difficulty of distinguishing signal from noise. Here, we use two independent lung adenocarcinoma datasets to identify all reproducible events at the k-mer level, in a tumor versus normal setting. We find reproducible events in many different locations (introns, intergenic, repeats) and forms (spliced, polyadenylated, chimeric, etc.). We systematically analyze events that are ignored in conventional transcriptomics and assess their value as biomarkers and for tumor classification, survival prediction, neoantigen prediction and correlation with the immune microenvironment. We find that unannotated lincRNAs, novel splice variants, endogenous HERV, Line1 and Alu repeats and bacterial RNAs each contribute to different, important aspects of tumor identity. We argue that differential RNA-seq analysis of tumor/normal sample collections would benefit from this type of k-mer analysis to cast a wider net on important cancer-related events. The code is available at https://github.com/Transipedia/dekupl-lung-cancer-inter-cohort. |
format | Online Article Text |
id | pubmed-8807116 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8807116 2022-02-02 The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma Wang, Yunfeng Xue, Haoliang Aglave, Marine Lainé, Antoine Gallopin, Mélina Gautheret, Daniel NAR Cancer Cancer Computational Biology The identity of cancer cells is defined by the interplay between genetic, epigenetic, transcriptional and post-transcriptional variation. A lot of this variation is present in RNA-seq data and can be captured at once using reference-free, k-mer analysis. An important issue with k-mer analysis, however, is the difficulty of distinguishing signal from noise. Here, we use two independent lung adenocarcinoma datasets to identify all reproducible events at the k-mer level, in a tumor versus normal setting. We find reproducible events in many different locations (introns, intergenic, repeats) and forms (spliced, polyadenylated, chimeric, etc.). We systematically analyze events that are ignored in conventional transcriptomics and assess their value as biomarkers and for tumor classification, survival prediction, neoantigen prediction and correlation with the immune microenvironment. We find that unannotated lincRNAs, novel splice variants, endogenous HERV, Line1 and Alu repeats and bacterial RNAs each contribute to different, important aspects of tumor identity. We argue that differential RNA-seq analysis of tumor/normal sample collections would benefit from this type of k-mer analysis to cast a wider net on important cancer-related events. The code is available at https://github.com/Transipedia/dekupl-lung-cancer-inter-cohort. Oxford University Press 2022-02-01 /pmc/articles/PMC8807116/ /pubmed/35118386 http://dx.doi.org/10.1093/narcan/zcac001 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of NAR Cancer.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Cancer Computational Biology Wang, Yunfeng Xue, Haoliang Aglave, Marine Lainé, Antoine Gallopin, Mélina Gautheret, Daniel The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma |
title | The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma |
title_full | The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma |
title_fullStr | The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma |
title_full_unstemmed | The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma |
title_short | The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma |
title_sort | contribution of uncharted rna sequences to tumor identity in lung adenocarcinoma |
topic | Cancer Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8807116/ https://www.ncbi.nlm.nih.gov/pubmed/35118386 http://dx.doi.org/10.1093/narcan/zcac001 |
work_keys_str_mv | AT wangyunfeng thecontributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT xuehaoliang thecontributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT aglavemarine thecontributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT laineantoine thecontributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT gallopinmelina thecontributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT gautheretdaniel thecontributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT wangyunfeng contributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT xuehaoliang contributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT aglavemarine contributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT laineantoine contributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT gallopinmelina contributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma AT gautheretdaniel contributionofunchartedrnasequencestotumoridentityinlungadenocarcinoma |