Cargando…
Using cascading Bloom filters to improve the memory usage for de Brujin graphs
BACKGROUND: De Brujin graphs are widely used in bioinformatics for processing next-generation sequencing data. Due to a very large size of NGS datasets, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently. RESULTS: In this work,...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2014
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3974045/ https://www.ncbi.nlm.nih.gov/pubmed/24565280 http://dx.doi.org/10.1186/1748-7188-9-2 |
_version_ | 1782479419547320320 |
---|---|
author | Salikhov, Kamil Sacomoto, Gustavo Kucherov, Gregory |
author_facet | Salikhov, Kamil Sacomoto, Gustavo Kucherov, Gregory |
author_sort | Salikhov, Kamil |
collection | PubMed |
description | BACKGROUND: De Brujin graphs are widely used in bioinformatics for processing next-generation sequencing data. Due to a very large size of NGS datasets, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently. RESULTS: In this work, we show how to reduce the memory required by the data structure of Chikhi and Rizk (WABI’12) that represents de Brujin graphs using Bloom filters. Our method requires 30% to 40% less memory with respect to their method, with insignificant impact on construction time. At the same time, our experiments showed a better query time compared to the method of Chikhi and Rizk. CONCLUSION: The proposed data structure constitutes, to our knowledge, currently the most efficient practical representation of de Bruijn graphs. |
format | Online Article Text |
id | pubmed-3974045 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-39740452014-04-11 Using cascading Bloom filters to improve the memory usage for de Brujin graphs Salikhov, Kamil Sacomoto, Gustavo Kucherov, Gregory Algorithms Mol Biol Research BACKGROUND: De Brujin graphs are widely used in bioinformatics for processing next-generation sequencing data. Due to a very large size of NGS datasets, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently. RESULTS: In this work, we show how to reduce the memory required by the data structure of Chikhi and Rizk (WABI’12) that represents de Brujin graphs using Bloom filters. Our method requires 30% to 40% less memory with respect to their method, with insignificant impact on construction time. At the same time, our experiments showed a better query time compared to the method of Chikhi and Rizk. CONCLUSION: The proposed data structure constitutes, to our knowledge, currently the most efficient practical representation of de Bruijn graphs. BioMed Central 2014-02-24 /pmc/articles/PMC3974045/ /pubmed/24565280 http://dx.doi.org/10.1186/1748-7188-9-2 Text en Copyright © 2014 Salikhov et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Salikhov, Kamil Sacomoto, Gustavo Kucherov, Gregory Using cascading Bloom filters to improve the memory usage for de Brujin graphs |
title | Using cascading Bloom filters to improve the memory usage for de Brujin graphs |
title_full | Using cascading Bloom filters to improve the memory usage for de Brujin graphs |
title_fullStr | Using cascading Bloom filters to improve the memory usage for de Brujin graphs |
title_full_unstemmed | Using cascading Bloom filters to improve the memory usage for de Brujin graphs |
title_short | Using cascading Bloom filters to improve the memory usage for de Brujin graphs |
title_sort | using cascading bloom filters to improve the memory usage for de brujin graphs |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3974045/ https://www.ncbi.nlm.nih.gov/pubmed/24565280 http://dx.doi.org/10.1186/1748-7188-9-2 |
work_keys_str_mv | AT salikhovkamil usingcascadingbloomfilterstoimprovethememoryusagefordebrujingraphs AT sacomotogustavo usingcascadingbloomfilterstoimprovethememoryusagefordebrujingraphs AT kucherovgregory usingcascadingbloomfilterstoimprovethememoryusagefordebrujingraphs |