
Toward an Efficient Method of Identifying Core Genes for Evolutionary and Functional Microbial Phylogenies

Microbial community metagenomes and individual microbial genomes are becoming increasingly accessible by means of high-throughput sequencing. Assessing organismal membership within a community is typically performed using one or a few taxonomic marker genes such as the 16S rDNA, and these same genes...


Bibliographic Details
Main authors: Segata, Nicola; Huttenhower, Curtis
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3171473/
https://www.ncbi.nlm.nih.gov/pubmed/21931822
http://dx.doi.org/10.1371/journal.pone.0024704
author Segata, Nicola
Huttenhower, Curtis
collection PubMed
description Microbial community metagenomes and individual microbial genomes are becoming increasingly accessible by means of high-throughput sequencing. Assessing organismal membership within a community is typically performed using one or a few taxonomic marker genes such as the 16S rDNA, and these same genes are also employed to reconstruct molecular phylogenies. There is thus a growing need to bioinformatically catalog strongly conserved core genes that can serve as effective taxonomic markers, to assess the agreement among phylogenies generated from different core genes, and to characterize the biological functions enriched within core genes and thus conserved throughout large microbial clades. We present a high-throughput method to recursively identify core genes (i.e., genes ubiquitous within a microbial clade) from a large number of complete input genomes. We analyzed over 1,100 genomes to produce core gene sets spanning 2,861 bacterial and archaeal clades, ranging in size from one to >2,000 genes in inverse correlation with the α-diversity (total phylogenetic branch length) spanned by each clade. As expected, these cores are enriched for housekeeping functions including translation, transcription, and replication, in addition to significant representations of regulatory, chaperone, and conserved uncharacterized proteins. In agreement with previous manually curated core gene sets, phylogenies constructed from one or more of these core genes agree with those built using 16S rDNA sequence similarity, suggesting that systematic core gene selection can be used to optimize both comparative genomics and determination of microbial community structure. Finally, we examine functional phylogenies constructed by clustering genomes by the presence or absence of orthologous gene families and show that they provide an informative complement to standard sequence-based molecular phylogenies.
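The recursive idea in the description can be illustrated with a small sketch. This is not the authors' implementation: it assumes a strict definition in which a clade's core is the exact intersection of the gene families found in every genome below it, and the taxonomy, genome names, and gene sets (`children`, `genome_genes`) are hypothetical toy data.

```python
from typing import Dict, FrozenSet, List, Set


def core_genes(
    clade: str,
    children: Dict[str, List[str]],
    genome_genes: Dict[str, Set[str]],
    cores: Dict[str, FrozenSet[str]],
) -> FrozenSet[str]:
    """Fill `cores` for `clade` and everything below it (post-order traversal).

    Assumption: a leaf (single genome) has its own gene families as its core,
    and an internal clade's core is the strict intersection of its children's cores.
    """
    if clade not in children:
        # Leaf: one genome, so every gene family it carries is trivially "core".
        core = frozenset(genome_genes[clade])
    else:
        child_cores = [
            core_genes(child, children, genome_genes, cores)
            for child in children[clade]
        ]
        # Keep only gene families shared by every child clade.
        core = frozenset.intersection(*child_cores)
    cores[clade] = core
    return core


# Hypothetical toy taxonomy: internal clades map to their children; leaves are genomes.
children = {"root": ["cladeA", "g3"], "cladeA": ["g1", "g2"]}

# Hypothetical gene-family content per genome.
genome_genes = {
    "g1": {"rpoB", "recA", "lacZ"},
    "g2": {"rpoB", "recA", "nifH"},
    "g3": {"rpoB", "gyrB"},
}

cores: Dict[str, FrozenSet[str]] = {}
core_genes("root", children, genome_genes, cores)

for name in ("cladeA", "root"):
    print(name, sorted(cores[name]))
```

In this toy data the broader clade (`root`) ends up with a smaller core than the narrower one (`cladeA`), which mirrors the inverse relationship between core size and clade diversity mentioned in the description. A strict intersection is only the simplest possible rule; a practical method might need to tolerate genes missing from a few genomes.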
format Online
Article
Text
id pubmed-3171473
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3171473 2011-09-19 Toward an Efficient Method of Identifying Core Genes for Evolutionary and Functional Microbial Phylogenies Segata, Nicola Huttenhower, Curtis PLoS One Research Article Public Library of Science 2011-09-12 /pmc/articles/PMC3171473/ /pubmed/21931822 http://dx.doi.org/10.1371/journal.pone.0024704 Text en Segata, Huttenhower. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Toward an Efficient Method of Identifying Core Genes for Evolutionary and Functional Microbial Phylogenies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3171473/
https://www.ncbi.nlm.nih.gov/pubmed/21931822
http://dx.doi.org/10.1371/journal.pone.0024704