
The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma

The identity of cancer cells is defined by the interplay between genetic, epigenetic, transcriptional and post-transcriptional variation. A lot of this variation is present in RNA-seq data and can be captured at once using reference-free k-mer analysis. An important issue with k-mer analysis, howeve...


Bibliographic Details
Main authors: Wang, Yunfeng, Xue, Haoliang, Aglave, Marine, Lainé, Antoine, Gallopin, Mélina, Gautheret, Daniel
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Cancer Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8807116/
https://www.ncbi.nlm.nih.gov/pubmed/35118386
http://dx.doi.org/10.1093/narcan/zcac001
author Wang, Yunfeng
Xue, Haoliang
Aglave, Marine
Lainé, Antoine
Gallopin, Mélina
Gautheret, Daniel
collection PubMed
description The identity of cancer cells is defined by the interplay between genetic, epigenetic, transcriptional and post-transcriptional variation. A lot of this variation is present in RNA-seq data and can be captured at once using reference-free k-mer analysis. An important issue with k-mer analysis, however, is the difficulty of distinguishing signal from noise. Here, we use two independent lung adenocarcinoma datasets to identify all reproducible events at the k-mer level, in a tumor versus normal setting. We find reproducible events in many different locations (introns, intergenic regions, repeats) and forms (spliced, polyadenylated, chimeric, etc.). We systematically analyze events that are ignored in conventional transcriptomics and assess their value as biomarkers and for tumor classification, survival prediction, neoantigen prediction and correlation with the immune microenvironment. We find that unannotated lincRNAs, novel splice variants, endogenous HERV, Line1 and Alu repeats and bacterial RNAs each contribute to different, important aspects of tumor identity. We argue that differential RNA-seq analysis of tumor/normal sample collections would benefit from this type of k-mer analysis to cast a wider net on important cancer-related events. The code is available at https://github.com/Transipedia/dekupl-lung-cancer-inter-cohort.
format Online
Article
Text
id pubmed-8807116
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8807116 2022-02-02 The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma Wang, Yunfeng Xue, Haoliang Aglave, Marine Lainé, Antoine Gallopin, Mélina Gautheret, Daniel NAR Cancer Cancer Computational Biology Oxford University Press 2022-02-01 /pmc/articles/PMC8807116/ /pubmed/35118386 http://dx.doi.org/10.1093/narcan/zcac001 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of NAR Cancer.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma
topic Cancer Computational Biology