Hundreds of Out-of-Frame Remodeled Gene Families in the Escherichia coli Pangenome

All genomes include gene families with very limited taxonomic distributions that potentially represent new genes and innovations in protein-coding sequence, raising questions on the origins of such genes. Some of these genes are hypothesized to have formed de novo, from noncoding sequences, and rece...

Bibliographic Details
Main authors: Watson, Andrew K; Lopez, Philippe; Bapteste, Eric
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8788219/
https://www.ncbi.nlm.nih.gov/pubmed/34792602
http://dx.doi.org/10.1093/molbev/msab329
author Watson, Andrew K
Lopez, Philippe
Bapteste, Eric
collection PubMed
description All genomes include gene families with very limited taxonomic distributions that potentially represent new genes and innovations in protein-coding sequence, raising questions on the origins of such genes. Some of these genes are hypothesized to have formed de novo, from noncoding sequences, and recent work has begun to elucidate the processes by which de novo gene formation can occur. A special case of de novo gene formation, overprinting, describes the origin of new genes from noncoding alternative reading frames of existing open reading frames (ORFs). We argue that additionally, out-of-frame gene fission/fusion events of alternative reading frames of ORFs and out-of-frame lateral gene transfers could contribute to the origin of new gene families. To demonstrate this, we developed an original pattern-search in sequence similarity networks, enhancing the use of these graphs, commonly used to detect in-frame remodeled genes. We applied this approach to gene families in 524 complete genomes of Escherichia coli. We identified 767 gene families whose evolutionary history likely included at least one out-of-frame remodeling event. These genes with out-of-frame components represent ∼2.5% of all genes in the E. coli pangenome, suggesting that alternative reading frames of existing ORFs can contribute to a significant proportion of de novo genes in bacteria.
format Online
Article
Text
id pubmed-8788219
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8788219 2022-01-26 Hundreds of Out-of-Frame Remodeled Gene Families in the Escherichia coli Pangenome Watson, Andrew K Lopez, Philippe Bapteste, Eric Mol Biol Evol Discoveries Oxford University Press 2021-11-18 /pmc/articles/PMC8788219/ /pubmed/34792602 http://dx.doi.org/10.1093/molbev/msab329 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Hundreds of Out-of-Frame Remodeled Gene Families in the Escherichia coli Pangenome
topic Discoveries
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8788219/
https://www.ncbi.nlm.nih.gov/pubmed/34792602
http://dx.doi.org/10.1093/molbev/msab329