
CAMITAX: Taxon labels for microbial genomes

BACKGROUND: The number of microbial genome sequences is increasing exponentially, especially thanks to recent advances in recovering complete or near-complete genomes from metagenomes and single cells. Assigning reliable taxon labels to genomes is key and often a prerequisite for downstream analyses...


Bibliographic Details
Main authors: Bremges, Andreas; Fritz, Adrian; McHardy, Alice C
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Technical Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6946028/
https://www.ncbi.nlm.nih.gov/pubmed/31909794
http://dx.doi.org/10.1093/gigascience/giz154
author Bremges, Andreas
Fritz, Adrian
McHardy, Alice C
collection PubMed
description BACKGROUND: The number of microbial genome sequences is increasing exponentially, especially thanks to recent advances in recovering complete or near-complete genomes from metagenomes and single cells. Assigning reliable taxon labels to genomes is key and often a prerequisite for downstream analyses. FINDINGS: We introduce CAMITAX, a scalable and reproducible workflow for the taxonomic labelling of microbial genomes recovered from isolates, single cells, and metagenomes. CAMITAX combines genome distance–, 16S ribosomal RNA gene–, and gene homology–based taxonomic assignments with phylogenetic placement. It uses Nextflow to orchestrate reference databases and software containers and thus combines ease of installation and use with computational reproducibility. We evaluated the method on several hundred metagenome-assembled genomes with high-quality taxonomic annotations from the TARA Oceans project, and we show that the ensemble classification method in CAMITAX improved on all individual methods across tested ranks. CONCLUSIONS: While we initially developed CAMITAX to aid the Critical Assessment of Metagenome Interpretation (CAMI) initiative, it evolved into a comprehensive software package to reliably assign taxon labels to microbial genomes. CAMITAX is available under Apache License 2.0 at https://github.com/CAMI-challenge/CAMITAX.
format Online
Article
Text
id pubmed-6946028
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6946028 2020-01-09 CAMITAX: Taxon labels for microbial genomes Bremges, Andreas; Fritz, Adrian; McHardy, Alice C Gigascience Technical Note Oxford University Press 2020-01-07 /pmc/articles/PMC6946028/ /pubmed/31909794 http://dx.doi.org/10.1093/gigascience/giz154 Text en © The Author(s) 2020. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title CAMITAX: Taxon labels for microbial genomes
topic Technical Note