
Intergenic RNA mainly derives from nascent transcripts of known genes

BACKGROUND: Eukaryotic genomes undergo pervasive transcription, leading to the production of many types of stable and unstable RNAs. Transcription is not restricted to regions with annotated gene features but includes almost any genomic context. Currently, the source and function of most RNAs origin...


Bibliographic Details
Main Authors: Agostini, Federico, Zagalak, Julian, Attig, Jan, Ule, Jernej, Luscombe, Nicholas M.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8097831/
https://www.ncbi.nlm.nih.gov/pubmed/33952325
http://dx.doi.org/10.1186/s13059-021-02350-x
author Agostini, Federico
Zagalak, Julian
Attig, Jan
Ule, Jernej
Luscombe, Nicholas M.
collection PubMed
description BACKGROUND: Eukaryotic genomes undergo pervasive transcription, leading to the production of many types of stable and unstable RNAs. Transcription is not restricted to regions with annotated gene features but includes almost any genomic context. Currently, the source and function of most RNAs originating from intergenic regions in the human genome remain unclear. RESULTS: We hypothesize that many intergenic RNAs can be ascribed to the presence of as-yet unannotated genes or the “fuzzy” transcription of known genes that extends beyond the annotated boundaries. To elucidate the contributions of these two sources, we assemble a dataset of more than 2.5 billion publicly available RNA-seq reads across 5 human cell lines and multiple cellular compartments to annotate transcriptional units in the human genome. About 80% of transcripts from unannotated intergenic regions can be attributed to the fuzzy transcription of existing genes; the remaining transcripts originate mainly from putative long non-coding RNA loci that are rarely spliced. We validate the transcriptional activity of these intergenic RNAs using independent measurements, including transcriptional start sites, chromatin signatures, and genomic occupancies of RNA polymerase II in various phosphorylation states. We also analyze the nuclear localization and sensitivities of intergenic transcripts to nucleases to illustrate that they tend to be rapidly degraded either on-chromatin by XRN2 or off-chromatin by the exosome. CONCLUSIONS: We provide a curated atlas of intergenic RNAs that distinguishes alternative processing of well-annotated genes from independent transcriptional units based on the combined analysis of chromatin signatures, nuclear RNA localization, and degradation pathways.
format Online
Article
Text
id pubmed-8097831
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8097831 2021-05-05 Intergenic RNA mainly derives from nascent transcripts of known genes Agostini, Federico Zagalak, Julian Attig, Jan Ule, Jernej Luscombe, Nicholas M. Genome Biol Research
BioMed Central 2021-05-05 /pmc/articles/PMC8097831/ /pubmed/33952325 http://dx.doi.org/10.1186/s13059-021-02350-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Intergenic RNA mainly derives from nascent transcripts of known genes
topic Research