
Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT

Current-day metagenomics analyses increasingly involve de novo taxonomic classification of long DNA sequences and metagenome-assembled genomes. Here, we show that the conventional best-hit approach often leads to classifications that are too specific, especially when the sequences represent novel de...


Bibliographic Details
Main authors: von Meijenfeldt, F. A. Bastiaan, Arkhipova, Ksenia, Cambuy, Diego D., Coutinho, Felipe H., Dutilh, Bas E.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6805573/
https://www.ncbi.nlm.nih.gov/pubmed/31640809
http://dx.doi.org/10.1186/s13059-019-1817-x
author von Meijenfeldt, F. A. Bastiaan
Arkhipova, Ksenia
Cambuy, Diego D.
Coutinho, Felipe H.
Dutilh, Bas E.
collection PubMed
description Current-day metagenomics analyses increasingly involve de novo taxonomic classification of long DNA sequences and metagenome-assembled genomes. Here, we show that the conventional best-hit approach often leads to classifications that are too specific, especially when the sequences represent novel deep lineages. We present a classification method that integrates multiple signals to classify sequences (Contig Annotation Tool, CAT) and metagenome-assembled genomes (Bin Annotation Tool, BAT). Classifications are automatically made at low taxonomic ranks if closely related organisms are present in the reference database and at higher ranks otherwise. The result is a high classification precision even for sequences from considerably unknown organisms.
format Online
Article
Text
id pubmed-6805573
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6805573 2019-10-24 Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT von Meijenfeldt, F. A. Bastiaan Arkhipova, Ksenia Cambuy, Diego D. Coutinho, Felipe H. Dutilh, Bas E. Genome Biol Method Current-day metagenomics analyses increasingly involve de novo taxonomic classification of long DNA sequences and metagenome-assembled genomes. Here, we show that the conventional best-hit approach often leads to classifications that are too specific, especially when the sequences represent novel deep lineages. We present a classification method that integrates multiple signals to classify sequences (Contig Annotation Tool, CAT) and metagenome-assembled genomes (Bin Annotation Tool, BAT). Classifications are automatically made at low taxonomic ranks if closely related organisms are present in the reference database and at higher ranks otherwise. The result is a high classification precision even for sequences from considerably unknown organisms. BioMed Central 2019-10-22 /pmc/articles/PMC6805573/ /pubmed/31640809 http://dx.doi.org/10.1186/s13059-019-1817-x Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT
topic Method