
Swarm v3: towards tera-scale amplicon clustering

MOTIVATION: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global clustering thresholds. Here, we present swarm v3 to address issues of contemporary datasets that are growing...


Bibliographic Details
Main authors: Mahé, Frédéric; Czech, Lucas; Stamatakis, Alexandros; Quince, Christopher; de Vargas, Colomban; Dunthorn, Micah; Rognes, Torbjørn
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Applications Notes
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8696092/
https://www.ncbi.nlm.nih.gov/pubmed/34244702
http://dx.doi.org/10.1093/bioinformatics/btab493
author Mahé, Frédéric
Czech, Lucas
Stamatakis, Alexandros
Quince, Christopher
de Vargas, Colomban
Dunthorn, Micah
Rognes, Torbjørn
collection PubMed
description MOTIVATION: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global clustering thresholds. Here, we present swarm v3 to address issues of contemporary datasets that are growing towards tera-byte sizes. RESULTS: When compared with previous swarm versions, swarm v3 has modernized C++ source code, reduced memory footprint by up to 50%, optimized CPU-usage and multithreading (more than 7 times faster with default parameters), and it has been extensively tested for its robustness and logic. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are available at https://github.com/torognes/swarm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8696092
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8696092 2022-01-04 Swarm v3: towards tera-scale amplicon clustering Mahé, Frédéric; Czech, Lucas; Stamatakis, Alexandros; Quince, Christopher; de Vargas, Colomban; Dunthorn, Micah; Rognes, Torbjørn Bioinformatics Applications Notes MOTIVATION: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global clustering thresholds. Here, we present swarm v3 to address issues of contemporary datasets that are growing towards tera-byte sizes. RESULTS: When compared with previous swarm versions, swarm v3 has modernized C++ source code, reduced memory footprint by up to 50%, optimized CPU-usage and multithreading (more than 7 times faster with default parameters), and it has been extensively tested for its robustness and logic. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are available at https://github.com/torognes/swarm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-07-09 /pmc/articles/PMC8696092/ /pubmed/34244702 http://dx.doi.org/10.1093/bioinformatics/btab493 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Swarm v3: towards tera-scale amplicon clustering
topic Applications Notes