A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes

BACKGROUND: Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. RESULTS:...

Bibliographic Details
Main authors: Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5577775/
https://www.ncbi.nlm.nih.gov/pubmed/28854954
http://dx.doi.org/10.1186/s13059-017-1293-0
author Hezroni, Hadas
Ben-Tov Perry, Rotem
Meir, Zohar
Housman, Gali
Lubelsky, Yoav
Ulitsky, Igor
collection PubMed
description BACKGROUND: Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. RESULTS: We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality. CONCLUSIONS: We estimate that ~ 55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors and those elements can influence the expression breadth and functionality of these lncRNAs. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-017-1293-0) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5577775
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5577775 2017-08-31 A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor Genome Biol Research
BioMed Central 2017-08-30 /pmc/articles/PMC5577775/ /pubmed/28854954 http://dx.doi.org/10.1186/s13059-017-1293-0 Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5577775/
https://www.ncbi.nlm.nih.gov/pubmed/28854954
http://dx.doi.org/10.1186/s13059-017-1293-0