BubbleGun: enumerating bubbles and superbubbles in genome graphs

MOTIVATION: With the fast development of sequencing technology, accurate de novo genome assembly is now possible even for larger genomes. Graph-based representations of genomes arise both as part of the assembly process and in the context of pangenomes representing a population. In both cases,...

Bibliographic Details
Main Authors: Dabbaghie, Fawaz; Ebler, Jana; Marschall, Tobias
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Applications Note
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9438957/
https://www.ncbi.nlm.nih.gov/pubmed/35799353
http://dx.doi.org/10.1093/bioinformatics/btac448
author Dabbaghie, Fawaz
Ebler, Jana
Marschall, Tobias
collection PubMed
description MOTIVATION: With the fast development of sequencing technology, accurate de novo genome assembly is now possible even for larger genomes. Graph-based representations of genomes arise both as part of the assembly process and in the context of pangenomes representing a population. In both cases, polymorphic loci lead to bubble structures in such graphs. Detecting bubbles is hence an important task when working with genomic variants in the context of genome graphs. RESULTS: Here, we present a fast general-purpose tool, called BubbleGun, for detecting bubbles and superbubbles in genome graphs. Furthermore, BubbleGun detects and outputs runs of linearly connected bubbles and superbubbles, which we call bubble chains. We showcase its utility on de Bruijn graphs and compare our results to vg’s snarl detection. We show that BubbleGun is considerably faster than vg, especially on bigger graphs: it reports all bubbles in less than 30 min on a human sample de Bruijn graph of around 2 million nodes. AVAILABILITY AND IMPLEMENTATION: BubbleGun is available and documented as a Python 3 package at https://github.com/fawaz-dabbaghieh/bubble_gun under the MIT license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9438957
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9438957 2022-09-06 BubbleGun: enumerating bubbles and superbubbles in genome graphs. Dabbaghie, Fawaz; Ebler, Jana; Marschall, Tobias. Bioinformatics, Applications Note. Oxford University Press, 2022-07-07. /pmc/articles/PMC9438957/ /pubmed/35799353 http://dx.doi.org/10.1093/bioinformatics/btac448 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title BubbleGun: enumerating bubbles and superbubbles in genome graphs
topic Applications Note