Swarm v2: highly-scalable and high-resolution amplicon clustering
Previously we presented Swarm v1, a novel and open source amplicon clustering program that produced fine-scale molecular operational taxonomic units (OTUs), free of arbitrary global clustering thresholds and input-order dependency. Swarm v1 worked with an initial phase that used iterative single-lin...
Main Authors: | Mahé, Frédéric; Rognes, Torbjørn; Quince, Christopher; de Vargas, Colomban; Dunthorn, Micah |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2015 |
Subjects: | Biodiversity |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4690345/ https://www.ncbi.nlm.nih.gov/pubmed/26713226 http://dx.doi.org/10.7717/peerj.1420 |
_version_ | 1782407000283414528 |
---|---|
author | Mahé, Frédéric Rognes, Torbjørn Quince, Christopher de Vargas, Colomban Dunthorn, Micah |
author_facet | Mahé, Frédéric Rognes, Torbjørn Quince, Christopher de Vargas, Colomban Dunthorn, Micah |
author_sort | Mahé, Frédéric |
collection | PubMed |
description | Previously we presented Swarm v1, a novel and open source amplicon clustering program that produced fine-scale molecular operational taxonomic units (OTUs), free of arbitrary global clustering thresholds and input-order dependency. Swarm v1 worked with an initial phase that used iterative single-linkage with a local clustering threshold (d), followed by a phase that used the internal abundance structures of clusters to break chained OTUs. Here we present Swarm v2, which has two important novel features: (1) a new algorithm for d = 1 that allows the computation time of the program to scale linearly with increasing amounts of data; and (2) the new fastidious option that reduces under-grouping by grafting low abundant OTUs (e.g., singletons and doubletons) onto larger ones. Swarm v2 also directly integrates the clustering and breaking phases, dereplicates sequencing reads with d = 0, outputs OTU representatives in fasta format, and plots individual OTUs as two-dimensional networks. |
format | Online Article Text |
id | pubmed-4690345 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-46903452015-12-28 Swarm v2: highly-scalable and high-resolution amplicon clustering Mahé, Frédéric Rognes, Torbjørn Quince, Christopher de Vargas, Colomban Dunthorn, Micah PeerJ Biodiversity Previously we presented Swarm v1, a novel and open source amplicon clustering program that produced fine-scale molecular operational taxonomic units (OTUs), free of arbitrary global clustering thresholds and input-order dependency. Swarm v1 worked with an initial phase that used iterative single-linkage with a local clustering threshold (d), followed by a phase that used the internal abundance structures of clusters to break chained OTUs. Here we present Swarm v2, which has two important novel features: (1) a new algorithm for d = 1 that allows the computation time of the program to scale linearly with increasing amounts of data; and (2) the new fastidious option that reduces under-grouping by grafting low abundant OTUs (e.g., singletons and doubletons) onto larger ones. Swarm v2 also directly integrates the clustering and breaking phases, dereplicates sequencing reads with d = 0, outputs OTU representatives in fasta format, and plots individual OTUs as two-dimensional networks. PeerJ Inc. 2015-12-10 /pmc/articles/PMC4690345/ /pubmed/26713226 http://dx.doi.org/10.7717/peerj.1420 Text en © 2015 Mahé et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Biodiversity Mahé, Frédéric Rognes, Torbjørn Quince, Christopher de Vargas, Colomban Dunthorn, Micah Swarm v2: highly-scalable and high-resolution amplicon clustering |
title | Swarm v2: highly-scalable and high-resolution amplicon clustering |
title_full | Swarm v2: highly-scalable and high-resolution amplicon clustering |
title_fullStr | Swarm v2: highly-scalable and high-resolution amplicon clustering |
title_full_unstemmed | Swarm v2: highly-scalable and high-resolution amplicon clustering |
title_short | Swarm v2: highly-scalable and high-resolution amplicon clustering |
title_sort | swarm v2: highly-scalable and high-resolution amplicon clustering |
topic | Biodiversity |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4690345/ https://www.ncbi.nlm.nih.gov/pubmed/26713226 http://dx.doi.org/10.7717/peerj.1420 |
work_keys_str_mv | AT mahefrederic swarmv2highlyscalableandhighresolutionampliconclustering AT rognestorbjørn swarmv2highlyscalableandhighresolutionampliconclustering AT quincechristopher swarmv2highlyscalableandhighresolutionampliconclustering AT devargascolomban swarmv2highlyscalableandhighresolutionampliconclustering AT dunthornmicah swarmv2highlyscalableandhighresolutionampliconclustering |
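
The description in this record mentions dereplication with d = 0 and iterative single-linkage clustering with a local threshold of d = 1 that scales linearly with the amount of data. The sketch below is an illustration only, not the Swarm source code. It assumes that near-linear scaling comes from generating every one-edit variant of a sequence and looking each up in a hash set, instead of comparing all pairs. The function names (`dereplicate`, `microvariants`, `cluster_d1`) are hypothetical, and the abundance-based breaking phase and the fastidious option are left out.

```python
from collections import Counter

ALPHABET = "ACGT"


def dereplicate(reads):
    """Merge identical reads (d = 0) and record each sequence's abundance."""
    return Counter(reads)


def microvariants(seq):
    """Yield every sequence exactly one substitution, deletion or insertion away."""
    for i in range(len(seq)):
        yield seq[:i] + seq[i + 1:]  # deletion of position i
        for base in ALPHABET:
            if base != seq[i]:
                yield seq[:i] + base + seq[i + 1:]  # substitution at position i
    for i in range(len(seq) + 1):
        for base in ALPHABET:
            yield seq[:i] + base + seq[i:]  # insertion before position i


def cluster_d1(abundances):
    """Iterative single-linkage clustering with a local threshold of d = 1.

    Illustrative assumption: seeds are visited by decreasing abundance, and
    each cluster grows by hash lookups of microvariants, so the work per
    sequence depends on its length rather than on the size of the data set.
    """
    unassigned = set(abundances)
    clusters = []
    for seed in sorted(abundances, key=lambda s: (-abundances[s], s)):
        if seed not in unassigned:
            continue
        unassigned.discard(seed)
        cluster = [seed]
        frontier = [seed]
        while frontier:
            current = frontier.pop()
            for variant in microvariants(current):
                if variant in unassigned:
                    unassigned.discard(variant)
                    cluster.append(variant)
                    frontier.append(variant)
        clusters.append(cluster)
    return clusters


if __name__ == "__main__":
    reads = ["ACGT"] * 5 + ["ACGA"] * 2 + ["ACGAA"] + ["TTTT"] * 3
    for members in cluster_d1(dereplicate(reads)):
        print(members)
```

In this toy input, `ACGAA` is two edits away from the seed `ACGT` but joins its cluster through the intermediate `ACGA`. That chaining is what single-linkage with a local threshold means, and it is also why the description refers to a separate step for breaking chained OTUs.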