Cargando…

Using cascading Bloom filters to improve the memory usage for de Brujin graphs

BACKGROUND: De Brujin graphs are widely used in bioinformatics for processing next-generation sequencing data. Due to a very large size of NGS datasets, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently. RESULTS: In this work,...

Descripción completa

Detalles Bibliográficos
Autores principales: Salikhov, Kamil, Sacomoto, Gustavo, Kucherov, Gregory
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2014
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3974045/
https://www.ncbi.nlm.nih.gov/pubmed/24565280
http://dx.doi.org/10.1186/1748-7188-9-2
_version_ 1782479419547320320
author Salikhov, Kamil
Sacomoto, Gustavo
Kucherov, Gregory
author_facet Salikhov, Kamil
Sacomoto, Gustavo
Kucherov, Gregory
author_sort Salikhov, Kamil
collection PubMed
description BACKGROUND: De Brujin graphs are widely used in bioinformatics for processing next-generation sequencing data. Due to a very large size of NGS datasets, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently. RESULTS: In this work, we show how to reduce the memory required by the data structure of Chikhi and Rizk (WABI’12) that represents de Brujin graphs using Bloom filters. Our method requires 30% to 40% less memory with respect to their method, with insignificant impact on construction time. At the same time, our experiments showed a better query time compared to the method of Chikhi and Rizk. CONCLUSION: The proposed data structure constitutes, to our knowledge, currently the most efficient practical representation of de Bruijn graphs.
format Online
Article
Text
id pubmed-3974045
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-39740452014-04-11 Using cascading Bloom filters to improve the memory usage for de Brujin graphs Salikhov, Kamil Sacomoto, Gustavo Kucherov, Gregory Algorithms Mol Biol Research BACKGROUND: De Brujin graphs are widely used in bioinformatics for processing next-generation sequencing data. Due to a very large size of NGS datasets, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently. RESULTS: In this work, we show how to reduce the memory required by the data structure of Chikhi and Rizk (WABI’12) that represents de Brujin graphs using Bloom filters. Our method requires 30% to 40% less memory with respect to their method, with insignificant impact on construction time. At the same time, our experiments showed a better query time compared to the method of Chikhi and Rizk. CONCLUSION: The proposed data structure constitutes, to our knowledge, currently the most efficient practical representation of de Bruijn graphs. BioMed Central 2014-02-24 /pmc/articles/PMC3974045/ /pubmed/24565280 http://dx.doi.org/10.1186/1748-7188-9-2 Text en Copyright © 2014 Salikhov et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Salikhov, Kamil
Sacomoto, Gustavo
Kucherov, Gregory
Using cascading Bloom filters to improve the memory usage for de Brujin graphs
title Using cascading Bloom filters to improve the memory usage for de Brujin graphs
title_full Using cascading Bloom filters to improve the memory usage for de Brujin graphs
title_fullStr Using cascading Bloom filters to improve the memory usage for de Brujin graphs
title_full_unstemmed Using cascading Bloom filters to improve the memory usage for de Brujin graphs
title_short Using cascading Bloom filters to improve the memory usage for de Brujin graphs
title_sort using cascading bloom filters to improve the memory usage for de brujin graphs
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3974045/
https://www.ncbi.nlm.nih.gov/pubmed/24565280
http://dx.doi.org/10.1186/1748-7188-9-2
work_keys_str_mv AT salikhovkamil usingcascadingbloomfilterstoimprovethememoryusagefordebrujingraphs
AT sacomotogustavo usingcascadingbloomfilterstoimprovethememoryusagefordebrujingraphs
AT kucherovgregory usingcascadingbloomfilterstoimprovethememoryusagefordebrujingraphs