Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis

In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein reference database, such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipel...

Bibliographic Details
Main authors: Gautam, Anupam; Felderhoff, Hendrik; Bağci, Caner; Huson, Daniel H.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8862659/
https://www.ncbi.nlm.nih.gov/pubmed/35191776
http://dx.doi.org/10.1128/msystems.01408-21
author Gautam, Anupam
Felderhoff, Hendrik
Bağci, Caner
Huson, Daniel H.
collection PubMed
description In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein reference database, such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipeline, which first aligns reads against NCBI-nr using DIAMOND and then performs taxonomic and functional binning using MEGAN. Here, we propose the use of the AnnoTree protein database, rather than NCBI-nr, in such alignment-based analyses to determine the prokaryotic content of metagenomic samples. We demonstrate a 2-fold speedup over the usage of the prokaryotic part of NCBI-nr and increased assignment rates, in particular assigning twice as many reads to KEGG. In addition to binning to the NCBI taxonomy, MEGAN now also bins to the GTDB taxonomy. IMPORTANCE The NCBI-nr database is not explicitly designed for the purpose of microbiome analysis, and its increasing size makes it unwieldy and computationally expensive for this purpose. The AnnoTree protein database is only one-quarter the size of the full NCBI-nr database and is explicitly designed for metagenomic analysis, so it should be supported by alignment-based pipelines.
format Online
Article
Text
id pubmed-8862659
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-8862659 2022-03-03 Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis Gautam, Anupam Felderhoff, Hendrik Bağci, Caner Huson, Daniel H. mSystems Research Article In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein reference database, such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipeline, which first aligns reads against NCBI-nr using DIAMOND and then performs taxonomic and functional binning using MEGAN. Here, we propose the use of the AnnoTree protein database, rather than NCBI-nr, in such alignment-based analyses to determine the prokaryotic content of metagenomic samples. We demonstrate a 2-fold speedup over the usage of the prokaryotic part of NCBI-nr and increased assignment rates, in particular assigning twice as many reads to KEGG. In addition to binning to the NCBI taxonomy, MEGAN now also bins to the GTDB taxonomy. IMPORTANCE The NCBI-nr database is not explicitly designed for the purpose of microbiome analysis, and its increasing size makes it unwieldy and computationally expensive for this purpose. The AnnoTree protein database is only one-quarter the size of the full NCBI-nr database and is explicitly designed for metagenomic analysis, so it should be supported by alignment-based pipelines. American Society for Microbiology 2022-02-22 /pmc/articles/PMC8862659/ /pubmed/35191776 http://dx.doi.org/10.1128/msystems.01408-21 Text en Copyright © 2022 Gautam et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8862659/
https://www.ncbi.nlm.nih.gov/pubmed/35191776
http://dx.doi.org/10.1128/msystems.01408-21