Phylogeny-Aware Analysis of Metagenome Community Ecology Based on Matched Reference Genomes while Bypassing Taxonomy

We introduce the operational genomic unit (OGU) method, a metagenome analysis strategy that directly exploits sequence alignment hits to individual reference genomes as the minimum unit for assessing the diversity of microbial communities and their relevance to environmental factors. This approach i...

Bibliographic Details
Main Authors: Zhu, Qiyun, Huang, Shi, Gonzalez, Antonio, McGrath, Imran, McDonald, Daniel, Haiminen, Niina, Armstrong, George, Vázquez-Baeza, Yoshiki, Yu, Julian, Kuczynski, Justin, Sepich-Poore, Gregory D., Swafford, Austin D., Das, Promi, Shaffer, Justin P., Lejzerowicz, Franck, Belda-Ferre, Pedro, Havulinna, Aki S., Méric, Guillaume, Niiranen, Teemu, Lahti, Leo, Salomaa, Veikko, Kim, Ho-Cheol, Jain, Mohit, Inouye, Michael, Gilbert, Jack A., Knight, Rob
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040630/
https://www.ncbi.nlm.nih.gov/pubmed/35369727
http://dx.doi.org/10.1128/msystems.00167-22
author Zhu, Qiyun
Huang, Shi
Gonzalez, Antonio
McGrath, Imran
McDonald, Daniel
Haiminen, Niina
Armstrong, George
Vázquez-Baeza, Yoshiki
Yu, Julian
Kuczynski, Justin
Sepich-Poore, Gregory D.
Swafford, Austin D.
Das, Promi
Shaffer, Justin P.
Lejzerowicz, Franck
Belda-Ferre, Pedro
Havulinna, Aki S.
Méric, Guillaume
Niiranen, Teemu
Lahti, Leo
Salomaa, Veikko
Kim, Ho-Cheol
Jain, Mohit
Inouye, Michael
Gilbert, Jack A.
Knight, Rob
author_sort Zhu, Qiyun
collection PubMed
description We introduce the operational genomic unit (OGU) method, a metagenome analysis strategy that directly exploits sequence alignment hits to individual reference genomes as the minimum unit for assessing the diversity of microbial communities and their relevance to environmental factors. This approach is independent of taxonomic classification, granting the possibility of maximal resolution of community composition, and organizes features into an accurate hierarchy using a phylogenomic tree. The outputs are suitable for contemporary analytical protocols for community ecology, differential abundance, and supervised learning while supporting phylogenetic methods, such as UniFrac and phylofactorization, that are seldom applied to shotgun metagenomics despite being prevalent in 16S rRNA gene amplicon studies. As demonstrated in two real-world case studies, the OGU method produces biologically meaningful patterns from microbiome data sets. Such patterns further remain detectable at very low metagenomic sequencing depths. Compared with taxonomic unit-based analyses implemented in currently adopted metagenomics tools, and the analysis of 16S rRNA gene amplicon sequence variants, this method shows superiority in informing biologically relevant insights, including stronger correlation with body environment and host sex on the Human Microbiome Project data set and more accurate prediction of human age by the gut microbiomes of Finnish individuals included in the FINRISK 2002 cohort. We provide Woltka, a bioinformatics tool to implement this method, with full integration with the QIIME 2 package and the Qiita web platform, to facilitate adoption of the OGU method in future metagenomics studies.
IMPORTANCE Shotgun metagenomics is a powerful, yet computationally challenging, technique compared to 16S rRNA gene amplicon sequencing for decoding the composition and structure of microbial communities. Current analyses of metagenomic data are primarily based on taxonomic classification, which is limited in feature resolution. To solve these challenges, we introduce operational genomic units (OGUs), which are the individual reference genomes derived from sequence alignment results, without further assigning them taxonomy. The OGU method advances current read-based metagenomics in two dimensions: (i) providing maximal resolution of community composition and (ii) permitting use of phylogeny-aware tools. Our analysis of real-world data sets shows that it is advantageous over currently adopted metagenomic analysis methods and the finest-grained 16S rRNA analysis methods in predicting biological traits. We thus propose the adoption of OGUs as an effective practice in metagenomic studies.
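As a rough illustration of the idea in the description above, where each matched reference genome is itself the feature, the following minimal sketch builds a per-sample OGU count table from alignment hits. It is not taken from the article or from Woltka's code: the function name `build_ogu_table`, the input layout, the genome identifiers, and the choice to split a read that hits k genomes into k equal shares of 1/k are all assumptions made for this example.

```python
from collections import defaultdict


def build_ogu_table(alignments):
    """Count reads per reference genome (OGU) for each sample.

    `alignments` maps a sample name to an iterable of
    (read_id, genome_id) alignment hits. This layout is hypothetical.
    """
    table = {}
    for sample, hits in alignments.items():
        # Collect the distinct genomes that each read aligned to.
        read_to_genomes = defaultdict(set)
        for read_id, genome_id in hits:
            read_to_genomes[read_id].add(genome_id)

        # Assumption for this sketch: a read hitting k genomes
        # contributes 1/k to each, so every read adds up to one count.
        counts = defaultdict(float)
        for genomes in read_to_genomes.values():
            share = 1.0 / len(genomes)
            for genome_id in genomes:
                counts[genome_id] += share
        table[sample] = dict(counts)
    return table


# Made-up hits: read r2 in sample_A aligns to two genomes.
example_hits = {
    "sample_A": [("r1", "G001"), ("r2", "G001"), ("r2", "G002"), ("r3", "G003")],
    "sample_B": [("r1", "G002"), ("r2", "G002")],
}
ogu_table = build_ogu_table(example_hits)
```

No taxonomic label is consulted at any point; the genome identifiers are the feature names. Because those identifiers can also serve as tip labels of a reference phylogeny, a table of this shape could in principle be passed to phylogeny-aware methods such as UniFrac.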
format Online
Article
Text
id pubmed-9040630
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-9040630 2022-04-27 mSystems Research Article American Society for Microbiology 2022-04-04 /pmc/articles/PMC9040630/ /pubmed/35369727 http://dx.doi.org/10.1128/msystems.00167-22 Text en Copyright © 2022 Zhu et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Phylogeny-Aware Analysis of Metagenome Community Ecology Based on Matched Reference Genomes while Bypassing Taxonomy
topic Research Article