
A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila

Orphan genes, lacking detectable homologs in outgroup species, typically represent 10–30% of eukaryotic genomes. Efforts to find the source of these young genes indicate that de novo emergence from non-coding DNA may in part explain their prevalence. Here, we investigate the roots of orphan gene eme...


Bibliographic Details
Main authors: Heames, Brennen, Schmitz, Jonathan, Bornberg-Bauer, Erich
Format: Online Article Text
Language: English
Published: Springer US 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7162840/
https://www.ncbi.nlm.nih.gov/pubmed/32253450
http://dx.doi.org/10.1007/s00239-020-09939-z
_version_ 1783523103916163072
author Heames, Brennen
Schmitz, Jonathan
Bornberg-Bauer, Erich
author_facet Heames, Brennen
Schmitz, Jonathan
Bornberg-Bauer, Erich
author_sort Heames, Brennen
collection PubMed
description Orphan genes, lacking detectable homologs in outgroup species, typically represent 10–30% of eukaryotic genomes. Efforts to find the source of these young genes indicate that de novo emergence from non-coding DNA may in part explain their prevalence. Here, we investigate the roots of orphan gene emergence in the Drosophila genus. Across the annotated proteomes of twelve species, we find 6297 orphan genes within 4953 taxon-specific clusters of orthologs. By inferring the ancestral DNA as non-coding for between 550 and 2467 (8.7–39.2%) of these genes, we describe for the first time how de novo emergence contributes to the abundance of clade-specific Drosophila genes. In support of them having functional roles, we show that de novo genes have robust expression and translational support. However, the distinct nucleotide sequences of de novo genes, which have characteristics intermediate between intergenic regions and conserved genes, reflect their recent birth from non-coding DNA. We find that de novo genes encode more disordered proteins than both older genes and intergenic regions. Together, our results suggest that gene emergence from non-coding DNA provides an abundant source of material for the evolution of new proteins. Following gene birth, gradual evolution over large evolutionary timescales moulds sequence properties towards those of conserved genes, resulting in a continuum of properties whose starting points depend on the nucleotide sequences of an initial pool of novel genes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s00239-020-09939-z) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-7162840
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-71628402020-04-23 A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila Heames, Brennen Schmitz, Jonathan Bornberg-Bauer, Erich J Mol Evol Original Article Orphan genes, lacking detectable homologs in outgroup species, typically represent 10–30% of eukaryotic genomes. Efforts to find the source of these young genes indicate that de novo emergence from non-coding DNA may in part explain their prevalence. Here, we investigate the roots of orphan gene emergence in the Drosophila genus. Across the annotated proteomes of twelve species, we find 6297 orphan genes within 4953 taxon-specific clusters of orthologs. By inferring the ancestral DNA as non-coding for between 550 and 2467 (8.7–39.2%) of these genes, we describe for the first time how de novo emergence contributes to the abundance of clade-specific Drosophila genes. In support of them having functional roles, we show that de novo genes have robust expression and translational support. However, the distinct nucleotide sequences of de novo genes, which have characteristics intermediate between intergenic regions and conserved genes, reflect their recent birth from non-coding DNA. We find that de novo genes encode more disordered proteins than both older genes and intergenic regions. Together, our results suggest that gene emergence from non-coding DNA provides an abundant source of material for the evolution of new proteins. Following gene birth, gradual evolution over large evolutionary timescales moulds sequence properties towards those of conserved genes, resulting in a continuum of properties whose starting points depend on the nucleotide sequences of an initial pool of novel genes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s00239-020-09939-z) contains supplementary material, which is available to authorized users. 
Springer US 2020-04-07 2020 /pmc/articles/PMC7162840/ /pubmed/32253450 http://dx.doi.org/10.1007/s00239-020-09939-z Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Original Article
Heames, Brennen
Schmitz, Jonathan
Bornberg-Bauer, Erich
A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila
title A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila
title_full A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila
title_fullStr A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila
title_full_unstemmed A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila
title_short A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila
title_sort continuum of evolving de novo genes drives protein-coding novelty in drosophila
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7162840/
https://www.ncbi.nlm.nih.gov/pubmed/32253450
http://dx.doi.org/10.1007/s00239-020-09939-z
work_keys_str_mv AT heamesbrennen acontinuumofevolvingdenovogenesdrivesproteincodingnoveltyindrosophila
AT schmitzjonathan acontinuumofevolvingdenovogenesdrivesproteincodingnoveltyindrosophila
AT bornbergbauererich acontinuumofevolvingdenovogenesdrivesproteincodingnoveltyindrosophila
AT heamesbrennen continuumofevolvingdenovogenesdrivesproteincodingnoveltyindrosophila
AT schmitzjonathan continuumofevolvingdenovogenesdrivesproteincodingnoveltyindrosophila
AT bornbergbauererich continuumofevolvingdenovogenesdrivesproteincodingnoveltyindrosophila