BubbleGun: enumerating bubbles and superbubbles in genome graphs
MOTIVATION: With the fast development of sequencing technology, accurate de novo genome assembly is now possible even for larger genomes. Graph-based representations of genomes arise both as part of the assembly process, but also in the context of pangenomes representing a population. In both cases,...
Main authors: | Dabbaghie, Fawaz; Ebler, Jana; Marschall, Tobias |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Applications Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9438957/ https://www.ncbi.nlm.nih.gov/pubmed/35799353 http://dx.doi.org/10.1093/bioinformatics/btac448 |
_version_ | 1784781942332325888 |
---|---|
author | Dabbaghie, Fawaz; Ebler, Jana; Marschall, Tobias |
author_facet | Dabbaghie, Fawaz; Ebler, Jana; Marschall, Tobias |
author_sort | Dabbaghie, Fawaz |
collection | PubMed |
description | MOTIVATION: With the fast development of sequencing technology, accurate de novo genome assembly is now possible even for larger genomes. Graph-based representations of genomes arise both as part of the assembly process, but also in the context of pangenomes representing a population. In both cases, polymorphic loci lead to bubble structures in such graphs. Detecting bubbles is hence an important task when working with genomic variants in the context of genome graphs. RESULTS: Here, we present a fast general-purpose tool, called BubbleGun, for detecting bubbles and superbubbles in genome graphs. Furthermore, BubbleGun detects and outputs runs of linearly connected bubbles and superbubbles, which we call bubble chains. We showcase its utility on de Bruijn graphs and compare our results to vg’s snarl detection. We show that BubbleGun is considerably faster than vg especially in bigger graphs, where it reports all bubbles in less than 30 min on a human sample de Bruijn graph of around 2 million nodes. AVAILABILITY AND IMPLEMENTATION: BubbleGun is available and documented as a Python3 package at https://github.com/fawaz-dabbaghieh/bubble_gun under MIT license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9438957 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9438957 2022-09-06 BubbleGun: enumerating bubbles and superbubbles in genome graphs Dabbaghie, Fawaz; Ebler, Jana; Marschall, Tobias Bioinformatics Applications Note MOTIVATION: With the fast development of sequencing technology, accurate de novo genome assembly is now possible even for larger genomes. Graph-based representations of genomes arise both as part of the assembly process, but also in the context of pangenomes representing a population. In both cases, polymorphic loci lead to bubble structures in such graphs. Detecting bubbles is hence an important task when working with genomic variants in the context of genome graphs. RESULTS: Here, we present a fast general-purpose tool, called BubbleGun, for detecting bubbles and superbubbles in genome graphs. Furthermore, BubbleGun detects and outputs runs of linearly connected bubbles and superbubbles, which we call bubble chains. We showcase its utility on de Bruijn graphs and compare our results to vg’s snarl detection. We show that BubbleGun is considerably faster than vg especially in bigger graphs, where it reports all bubbles in less than 30 min on a human sample de Bruijn graph of around 2 million nodes. AVAILABILITY AND IMPLEMENTATION: BubbleGun is available and documented as a Python3 package at https://github.com/fawaz-dabbaghieh/bubble_gun under MIT license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-07-07 /pmc/articles/PMC9438957/ /pubmed/35799353 http://dx.doi.org/10.1093/bioinformatics/btac448 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Note Dabbaghie, Fawaz Ebler, Jana Marschall, Tobias BubbleGun: enumerating bubbles and superbubbles in genome graphs |
title | BubbleGun: enumerating bubbles and superbubbles in genome graphs |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9438957/ https://www.ncbi.nlm.nih.gov/pubmed/35799353 http://dx.doi.org/10.1093/bioinformatics/btac448 |
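
The abstract in this record describes bubbles as structures that polymorphic loci produce in genome graphs. As a rough illustration of the idea only, the sketch below finds the simplest kind of bubble in a plain directed graph: a source node with exactly two branch nodes that rejoin at a single sink. It is not BubbleGun's implementation and does not use its API. The function name `find_simple_bubbles` and the edge-list input are assumptions made for this example, and real genome graphs are typically bidirected, which this sketch ignores. Superbubbles and bubble chains are not handled.

```python
from collections import defaultdict


def find_simple_bubbles(edges):
    """Return sorted (source, sink) pairs of simple bubbles.

    Hypothetical helper for illustration; not part of BubbleGun.
    A simple bubble here is a source with exactly two successors,
    each of which has one predecessor and one successor, where both
    successors lead to the same sink and the sink has no other
    incoming edges.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)

    bubbles = []
    # Iterate over a snapshot because defaultdict lookups may add keys.
    for source in list(succ):
        branches = succ[source]
        if len(branches) != 2:
            continue
        # Each branch must be a single node hanging only off the source.
        if any(len(pred[b]) != 1 or len(succ[b]) != 1 for b in branches):
            continue
        sinks = {next(iter(succ[b])) for b in branches}
        if len(sinks) != 1:
            continue
        sink = next(iter(sinks))
        # The sink must be entered only through the two branches.
        if sink != source and len(pred[sink]) == 2:
            bubbles.append((source, sink))
    return sorted(bubbles)


if __name__ == "__main__":
    # Toy graph: a splits into b and c, which rejoin at d.
    toy_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
    print(find_simple_bubbles(toy_edges))
```

On the toy graph, the two branches `b` and `c` could stand for two alleles at one locus, with `a` and `d` as the shared flanking sequence.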