Stochastic Gain and Loss of Novel Transcribed Open Reading Frames in the Human Lineage

In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA...

Bibliographic Details
Main authors: Dowling, Daniel; Schmitz, Jonathan F; Bornberg-Bauer, Erich
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7674706/
https://www.ncbi.nlm.nih.gov/pubmed/33210146
http://dx.doi.org/10.1093/gbe/evaa194
author Dowling, Daniel
Schmitz, Jonathan F
Bornberg-Bauer, Erich
collection PubMed
description In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA can also produce hazardous protein molecules, which can misfold and/or form toxic aggregates. The dynamics of how de novo proteins emerge from potentially toxic raw materials and what influences their long-term survival are unknown. Here, using transcriptomic data from human and five other primates, we generate a set of transcribed human ORFs at six conservation levels to investigate which properties influence the early emergence and long-term retention of these expressed ORFs. As these taxa diverged from each other relatively recently, we present a fine scale view of the evolution of novel sequences over recent evolutionary time. We find that novel human-restricted ORFs are preferentially located on GC-rich gene-dense chromosomes, suggesting their retention is linked to pre-existing genes. Sequence properties such as intrinsic structural disorder and aggregation propensity—which have been proposed to play a role in survival of de novo genes—remain unchanged over time. Even very young sequences code for proteins with low aggregation propensities, suggesting that genomic regions with many novel transcribed ORFs are concomitantly less likely to produce ORFs which code for harmful toxic proteins. Our data indicate that the survival of these novel ORFs is largely stochastic rather than shaped by selection.
format Online
Article
Text
id pubmed-7674706
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7674706 2020-11-24 Stochastic Gain and Loss of Novel Transcribed Open Reading Frames in the Human Lineage Dowling, Daniel; Schmitz, Jonathan F; Bornberg-Bauer, Erich Genome Biol Evol Research Article Oxford University Press 2020-09-16 /pmc/articles/PMC7674706/ /pubmed/33210146 http://dx.doi.org/10.1093/gbe/evaa194 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Stochastic Gain and Loss of Novel Transcribed Open Reading Frames in the Human Lineage
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7674706/
https://www.ncbi.nlm.nih.gov/pubmed/33210146
http://dx.doi.org/10.1093/gbe/evaa194