
mTAGs: taxonomic profiling using degenerate consensus reference sequences of ribosomal RNA genes


Bibliographic Details
Main authors: Salazar, Guillem; Ruscheweyh, Hans-Joachim; Hildebrand, Falk; Acinas, Silvia G; Sunagawa, Shinichi
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8696115/
https://www.ncbi.nlm.nih.gov/pubmed/34260698
http://dx.doi.org/10.1093/bioinformatics/btab465
Description
Summary: Profiling the taxonomic composition of microbial communities commonly involves the classification of ribosomal RNA gene fragments. As a trade-off to maintain high classification accuracy, existing tools are typically limited to the genus level. Here, we present mTAGs, a taxonomic profiling tool that implements the alignment of metagenomic sequencing reads to degenerate consensus reference sequences of small subunit ribosomal RNA genes. It uses DNA fragments, that is, paired-end sequencing reads, as count units and provides relative abundance profiles at multiple taxonomic ranks, including operational taxonomic units based on a 97% sequence identity cutoff. At the genus rank, mTAGs outperformed other tools across several metrics, such as the F1 score by >11% across data from different environments, and achieved competitive (F1 score) or better results (Bray–Curtis dissimilarity) at the sub-genus level.
AVAILABILITY AND IMPLEMENTATION: The software tool mTAGs is implemented in Python. The source code and binaries are freely available (https://github.com/SushiLab/mTAGs). The data underlying this article are available in Zenodo, at https://doi.org/10.5281/zenodo.4352762.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
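The summary names two evaluation metrics, the F1 score and the Bray–Curtis dissimilarity, without defining them. The sketch below shows one common way to compute both when comparing a predicted taxonomic profile with a known reference profile. It is an illustration only, not the authors' evaluation code and not part of the mTAGs software. The function names, the taxon names, and the counts are all hypothetical, and the F1 score is assumed here to be computed on the presence or absence of taxa.

```python
def relative_abundance(counts):
    """Convert raw counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("profile contains no counts")
    return {taxon: n / total for taxon, n in counts.items()}


def f1_score(predicted, expected):
    """F1 score on taxon detection (assumption: presence/absence only)."""
    pred = {t for t, a in predicted.items() if a > 0}
    exp = {t for t, a in expected.items() if a > 0}
    true_positives = len(pred & exp)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(exp)
    return 2 * precision * recall / (precision + recall)


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    taxa = set(p) | set(q)
    numerator = sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in taxa)
    denominator = sum(p.get(t, 0.0) + q.get(t, 0.0) for t in taxa)
    return numerator / denominator if denominator else 0.0


# Hypothetical fragment counts at the genus rank (made-up numbers).
expected = relative_abundance(
    {"Prochlorococcus": 60, "Pelagibacter": 30, "Synechococcus": 10}
)
predicted = relative_abundance(
    {"Prochlorococcus": 50, "Pelagibacter": 30, "Alteromonas": 20}
)

print(f"F1 score: {f1_score(predicted, expected):.3f}")
print(f"Bray-Curtis dissimilarity: {bray_curtis(predicted, expected):.3f}")
```

A higher F1 score means the set of detected taxa matches the reference more closely, while a lower Bray–Curtis dissimilarity means the estimated abundances are closer to the reference abundances.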