Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ

Pairwise whole-genome homology mapping is the problem of finding all pairs of homologous intervals between a pair of genomes. As the number of available whole genomes has been rising dramatically in the last few years, there has been a need for more scalable homology mappers. In this paper, we devel...

Bibliographic Details
Main Authors: Minkin, Ilia; Medvedev, Paul
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7303978/
https://www.ncbi.nlm.nih.gov/pubmed/32563153
http://dx.doi.org/10.1016/j.isci.2020.101224
_version_ 1783548171687821312
author Minkin, Ilia
Medvedev, Paul
author_facet Minkin, Ilia
Medvedev, Paul
author_sort Minkin, Ilia
collection PubMed
description Pairwise whole-genome homology mapping is the problem of finding all pairs of homologous intervals between a pair of genomes. As the number of available whole genomes has been rising dramatically in the last few years, there has been a need for more scalable homology mappers. In this paper, we develop an algorithm (BubbZ) for computing whole-genome pairwise homology mappings, especially in the context of all-to-all comparison for multiple genomes. BubbZ is based on an algorithm for computing chains in compacted de Bruijn graphs. We evaluate BubbZ on simulated datasets, a dataset composed of 16 long mouse genomes, and a large dataset of 1,600 Salmonella genomes. We show up to approximately an order of magnitude speed improvement, compared with MashMap2 and Minimap2, while retaining similar accuracy.
format Online
Article
Text
id pubmed-7303978
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-73039782020-06-22 Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ Minkin, Ilia Medvedev, Paul iScience Article Pairwise whole-genome homology mapping is the problem of finding all pairs of homologous intervals between a pair of genomes. As the number of available whole genomes has been rising dramatically in the last few years, there has been a need for more scalable homology mappers. In this paper, we develop an algorithm (BubbZ) for computing whole-genome pairwise homology mappings, especially in the context of all-to-all comparison for multiple genomes. BubbZ is based on an algorithm for computing chains in compacted de Bruijn graphs. We evaluate BubbZ on simulated datasets, a dataset composed of 16 long mouse genomes, and a large dataset of 1,600 Salmonella genomes. We show up to approximately an order of magnitude speed improvement, compared with MashMap2 and Minimap2, while retaining similar accuracy. Elsevier 2020-06-03 /pmc/articles/PMC7303978/ /pubmed/32563153 http://dx.doi.org/10.1016/j.isci.2020.101224 Text en © 2020 The Authors http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Article
Minkin, Ilia
Medvedev, Paul
Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ
title Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ
title_full Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ
title_fullStr Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ
title_full_unstemmed Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ
title_short Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ
title_sort scalable pairwise whole-genome homology mapping of long genomes with bubbz
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7303978/
https://www.ncbi.nlm.nih.gov/pubmed/32563153
http://dx.doi.org/10.1016/j.isci.2020.101224
work_keys_str_mv AT minkinilia scalablepairwisewholegenomehomologymappingoflonggenomeswithbubbz
AT medvedevpaul scalablepairwisewholegenomehomologymappingoflonggenomeswithbubbz