Cargando…

Swarm v3: towards tera-scale amplicon clustering

MOTIVATION: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global clustering thresholds. Here, we present swarm v3 to address issues of contemporary datasets that are growing...

Descripción completa

Detalles Bibliográficos
Autores principales: Mahé, Frédéric, Czech, Lucas, Stamatakis, Alexandros, Quince, Christopher, de Vargas, Colomban, Dunthorn, Micah, Rognes, Torbjørn
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8696092/
https://www.ncbi.nlm.nih.gov/pubmed/34244702
http://dx.doi.org/10.1093/bioinformatics/btab493
Descripción
Sumario:MOTIVATION: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global clustering thresholds. Here, we present swarm v3 to address issues of contemporary datasets that are growing towards tera-byte sizes. RESULTS: When compared with previous swarm versions, swarm v3 has modernized C++ source code, reduced memory footprint by up to 50%, optimized CPU-usage and multithreading (more than 7 times faster with default parameters), and it has been extensively tested for its robustness and logic. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are available at https://github.com/torognes/swarm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.