
cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs

MOTIVATION: The recently developed barcoding-based synthetic long read (SLR) technologies have already found many applications in genome assembly and analysis. However, although some new barcoding protocols are emerging and the range of SLR applications is being expanded, the existing SLR assemblers...


Bibliographic Details
Main Authors: Tolstoganov, Ivan; Bankevich, Anton; Chen, Zhoutao; Pevzner, Pavel A
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612831/
https://www.ncbi.nlm.nih.gov/pubmed/31510642
http://dx.doi.org/10.1093/bioinformatics/btz349
_version_ 1783432946836832256
author Tolstoganov, Ivan
Bankevich, Anton
Chen, Zhoutao
Pevzner, Pavel A
author_sort Tolstoganov, Ivan
collection PubMed
description MOTIVATION: The recently developed barcoding-based synthetic long read (SLR) technologies have already found many applications in genome assembly and analysis. However, although some new barcoding protocols are emerging and the range of SLR applications is being expanded, the existing SLR assemblers are optimized for a narrow range of parameters and are not easily extendable to new barcoding technologies and new applications such as metagenomics or hybrid assembly. RESULTS: We describe the algorithmic challenge of the SLR assembly and present a cloudSPAdes algorithm for SLR assembly that is based on analyzing the de Bruijn graph of SLRs. We benchmarked cloudSPAdes across various barcoding technologies/applications and demonstrated that it improves on the state-of-the-art SLR assemblers in accuracy and speed. AVAILABILITY AND IMPLEMENTATION: Source code and installation manual for cloudSPAdes are available at https://github.com/ablab/spades/releases/tag/cloudspades-paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6612831
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6612831 2019-07-12 cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs Tolstoganov, Ivan Bankevich, Anton Chen, Zhoutao Pevzner, Pavel A Bioinformatics Ismb/Eccb 2019 Conference Proceedings Oxford University Press 2019-07 2019-07-05 /pmc/articles/PMC6612831/ /pubmed/31510642 http://dx.doi.org/10.1093/bioinformatics/btz349 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs
topic Ismb/Eccb 2019 Conference Proceedings