cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs
MOTIVATION: The recently developed barcoding-based synthetic long read (SLR) technologies have already found many applications in genome assembly and analysis. However, although some new barcoding protocols are emerging and the range of SLR applications is being expanded, the existing SLR assemblers...
Main Authors: | Tolstoganov, Ivan; Bankevich, Anton; Chen, Zhoutao; Pevzner, Pavel A |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Ismb/Eccb 2019 Conference Proceedings |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612831/ https://www.ncbi.nlm.nih.gov/pubmed/31510642 http://dx.doi.org/10.1093/bioinformatics/btz349 |
_version_ | 1783432946836832256 |
---|---|
author | Tolstoganov, Ivan Bankevich, Anton Chen, Zhoutao Pevzner, Pavel A |
author_facet | Tolstoganov, Ivan Bankevich, Anton Chen, Zhoutao Pevzner, Pavel A |
author_sort | Tolstoganov, Ivan |
collection | PubMed |
description | MOTIVATION: The recently developed barcoding-based synthetic long read (SLR) technologies have already found many applications in genome assembly and analysis. However, although some new barcoding protocols are emerging and the range of SLR applications is being expanded, the existing SLR assemblers are optimized for a narrow range of parameters and are not easily extendable to new barcoding technologies and new applications such as metagenomics or hybrid assembly. RESULTS: We describe the algorithmic challenge of the SLR assembly and present a cloudSPAdes algorithm for SLR assembly that is based on analyzing the de Bruijn graph of SLRs. We benchmarked cloudSPAdes across various barcoding technologies/applications and demonstrated that it improves on the state-of-the-art SLR assemblers in accuracy and speed. AVAILABILITY AND IMPLEMENTATION: Source code and installation manual for cloudSPAdes are available at https://github.com/ablab/spades/releases/tag/cloudspades-paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-6612831 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-66128312019-07-12 cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs Tolstoganov, Ivan Bankevich, Anton Chen, Zhoutao Pevzner, Pavel A Bioinformatics Ismb/Eccb 2019 Conference Proceedings MOTIVATION: The recently developed barcoding-based synthetic long read (SLR) technologies have already found many applications in genome assembly and analysis. However, although some new barcoding protocols are emerging and the range of SLR applications is being expanded, the existing SLR assemblers are optimized for a narrow range of parameters and are not easily extendable to new barcoding technologies and new applications such as metagenomics or hybrid assembly. RESULTS: We describe the algorithmic challenge of the SLR assembly and present a cloudSPAdes algorithm for SLR assembly that is based on analyzing the de Bruijn graph of SLRs. We benchmarked cloudSPAdes across various barcoding technologies/applications and demonstrated that it improves on the state-of-the-art SLR assemblers in accuracy and speed. AVAILABILITY AND IMPLEMENTATION: Source code and installation manual for cloudSPAdes are available at https://github.com/ablab/spades/releases/tag/cloudspades-paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2019-07 2019-07-05 /pmc/articles/PMC6612831/ /pubmed/31510642 http://dx.doi.org/10.1093/bioinformatics/btz349 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb/Eccb 2019 Conference Proceedings Tolstoganov, Ivan Bankevich, Anton Chen, Zhoutao Pevzner, Pavel A cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs |
title | cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs |
title_full | cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs |
title_fullStr | cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs |
title_full_unstemmed | cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs |
title_short | cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs |
title_sort | cloudspades: assembly of synthetic long reads using de bruijn graphs |
topic | Ismb/Eccb 2019 Conference Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612831/ https://www.ncbi.nlm.nih.gov/pubmed/31510642 http://dx.doi.org/10.1093/bioinformatics/btz349 |