Readjoiner: a fast and memory efficient string graph-based sequence assembler
BACKGROUND: Ongoing improvements in throughput of the next-generation sequencing technologies challenge the current generation of de novo sequence assemblers. Most recent sequence assemblers are based on the construction of a de Bruijn graph. An alternative framework of growing interest is the assem...
Main Authors: | Gonnella, Giorgio; Kurtz, Stefan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3507659/ https://www.ncbi.nlm.nih.gov/pubmed/22559072 http://dx.doi.org/10.1186/1471-2105-13-82 |
_version_ | 1782251103030607872 |
---|---|
author | Gonnella, Giorgio Kurtz, Stefan |
author_facet | Gonnella, Giorgio Kurtz, Stefan |
author_sort | Gonnella, Giorgio |
collection | PubMed |
description | BACKGROUND: Ongoing improvements in throughput of the next-generation sequencing technologies challenge the current generation of de novo sequence assemblers. Most recent sequence assemblers are based on the construction of a de Bruijn graph. An alternative framework of growing interest is the assembly string graph, not necessitating a division of the reads into k-mers, but requiring fast algorithms for the computation of suffix-prefix matches among all pairs of reads. RESULTS: Here we present efficient methods for the construction of a string graph from a set of sequencing reads. Our approach employs suffix sorting and scanning methods to compute suffix-prefix matches. Transitive edges are recognized and eliminated early in the process and the graph is efficiently constructed including irreducible edges only. CONCLUSIONS: Our suffix-prefix match determination and string graph construction algorithms have been implemented in the software package Readjoiner. Comparison with existing string graph-based assemblers shows that Readjoiner is faster and more space efficient. Readjoiner is available at http://www.zbh.uni-hamburg.de/readjoiner. |
format | Online Article Text |
id | pubmed-3507659 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-35076592012-12-03 Readjoiner: a fast and memory efficient string graph-based sequence assembler Gonnella, Giorgio Kurtz, Stefan BMC Bioinformatics Methodology Article BACKGROUND: Ongoing improvements in throughput of the next-generation sequencing technologies challenge the current generation of de novo sequence assemblers. Most recent sequence assemblers are based on the construction of a de Bruijn graph. An alternative framework of growing interest is the assembly string graph, not necessitating a division of the reads into k-mers, but requiring fast algorithms for the computation of suffix-prefix matches among all pairs of reads. RESULTS: Here we present efficient methods for the construction of a string graph from a set of sequencing reads. Our approach employs suffix sorting and scanning methods to compute suffix-prefix matches. Transitive edges are recognized and eliminated early in the process and the graph is efficiently constructed including irreducible edges only. CONCLUSIONS: Our suffix-prefix match determination and string graph construction algorithms have been implemented in the software package Readjoiner. Comparison with existing string graph-based assemblers shows that Readjoiner is faster and more space efficient. Readjoiner is available at http://www.zbh.uni-hamburg.de/readjoiner. BioMed Central 2012-05-06 /pmc/articles/PMC3507659/ /pubmed/22559072 http://dx.doi.org/10.1186/1471-2105-13-82 Text en Copyright ©2012 Gonnella and Kurtz; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Gonnella, Giorgio Kurtz, Stefan Readjoiner: a fast and memory efficient string graph-based sequence assembler |
title | Readjoiner: a fast and memory efficient string graph-based sequence assembler |
title_full | Readjoiner: a fast and memory efficient string graph-based sequence assembler |
title_fullStr | Readjoiner: a fast and memory efficient string graph-based sequence assembler |
title_full_unstemmed | Readjoiner: a fast and memory efficient string graph-based sequence assembler |
title_short | Readjoiner: a fast and memory efficient string graph-based sequence assembler |
title_sort | readjoiner: a fast and memory efficient string graph-based sequence assembler |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3507659/ https://www.ncbi.nlm.nih.gov/pubmed/22559072 http://dx.doi.org/10.1186/1471-2105-13-82 |
work_keys_str_mv | AT gonnellagiorgio readjoinerafastandmemoryefficientstringgraphbasedsequenceassembler AT kurtzstefan readjoinerafastandmemoryefficientstringgraphbasedsequenceassembler |
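
The description above mentions two building blocks of a string graph: suffix-prefix matches between pairs of reads, and the removal of transitive edges so that only irreducible edges remain. The following is a minimal sketch of those two ideas, not Readjoiner's implementation. It uses a naive all-pairs comparison instead of the suffix sorting and scanning the abstract describes, it removes transitive edges after building the full graph rather than early in the process, and it assumes error-free reads from a non-repetitive sequence with no reverse complements. All function names and the example reads are hypothetical.

```python
from itertools import permutations


def suffix_prefix_overlap(a: str, b: str, min_len: int) -> int:
    """Return the length of the longest proper suffix of `a` that equals a
    prefix of `b`, or 0 if no such match has at least `min_len` characters.
    Assumes min_len >= 1."""
    max_len = min(len(a), len(b)) - 1  # proper overlap: excludes containment
    for length in range(max_len, min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def overlap_graph(reads, min_len):
    """Naive all-pairs overlap graph: edges[i][j] is the overlap length
    between a suffix of reads[i] and a prefix of reads[j]."""
    edges = {}
    for i, j in permutations(range(len(reads)), 2):
        ov = suffix_prefix_overlap(reads[i], reads[j], min_len)
        if ov:
            edges.setdefault(i, {})[j] = ov
    return edges


def remove_transitive_edges(edges):
    """Drop edge i -> k whenever some j exists with edges i -> j and j -> k.
    This simplified test is only adequate under the assumptions stated in
    the lead-in (exact overlaps, no repeats, no cycles)."""
    reduced = {}
    for i, targets in edges.items():
        keep = {}
        for k, ov in targets.items():
            transitive = any(
                j != k and k in edges.get(j, {}) for j in targets
            )
            if not transitive:
                keep[k] = ov
        if keep:
            reduced[i] = keep
    return reduced


# Hypothetical reads of length 6 taken at offsets 0, 2 and 4 of "ACGTACGGTT".
reads = ["ACGTAC", "GTACGG", "ACGGTT"]
full = overlap_graph(reads, min_len=2)
irreducible = remove_transitive_edges(full)
```

In this toy example the edge from read 0 to read 2 is implied by the path through read 1, so only the two edges 0 → 1 and 1 → 2 remain after reduction. The quadratic all-pairs loop is chosen for readability only; it does not scale to real sequencing data sets.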