Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution

The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences’ properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combini...

Bibliographic Details
Main authors: Papadopoulos, Chris; Callebaut, Isabelle; Gelly, Jean-Christophe; Hatin, Isabelle; Namy, Olivier; Renard, Maxime; Lespinet, Olivier; Lopes, Anne
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8647833/
https://www.ncbi.nlm.nih.gov/pubmed/34810219
http://dx.doi.org/10.1101/gr.275638.121
_version_ 1784610677613133824
author Papadopoulos, Chris
Callebaut, Isabelle
Gelly, Jean-Christophe
Hatin, Isabelle
Namy, Olivier
Renard, Maxime
Lespinet, Olivier
Lopes, Anne
author_facet Papadopoulos, Chris
Callebaut, Isabelle
Gelly, Jean-Christophe
Hatin, Isabelle
Namy, Olivier
Renard, Maxime
Lespinet, Olivier
Lopes, Anne
author_sort Papadopoulos, Chris
collection PubMed
description The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences’ properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states’ diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.
format Online
Article
Text
id pubmed-8647833
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-86478332022-06-01 Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution Papadopoulos, Chris Callebaut, Isabelle Gelly, Jean-Christophe Hatin, Isabelle Namy, Olivier Renard, Maxime Lespinet, Olivier Lopes, Anne Genome Res Research The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences’ properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states’ diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe. 
Cold Spring Harbor Laboratory Press 2021-12 /pmc/articles/PMC8647833/ /pubmed/34810219 http://dx.doi.org/10.1101/gr.275638.121 Text en © 2021 Papadopoulos et al.; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
spellingShingle Research
Papadopoulos, Chris
Callebaut, Isabelle
Gelly, Jean-Christophe
Hatin, Isabelle
Namy, Olivier
Renard, Maxime
Lespinet, Olivier
Lopes, Anne
Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution
title Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8647833/
https://www.ncbi.nlm.nih.gov/pubmed/34810219
http://dx.doi.org/10.1101/gr.275638.121