Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ
Pairwise whole-genome homology mapping is the problem of finding all pairs of homologous intervals between a pair of genomes. As the number of available whole genomes has been rising dramatically in the last few years, there has been a need for more scalable homology mappers. In this paper, we devel...
Main authors: | Minkin, Ilia, Medvedev, Paul |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7303978/ https://www.ncbi.nlm.nih.gov/pubmed/32563153 http://dx.doi.org/10.1016/j.isci.2020.101224 |
_version_ | 1783548171687821312 |
---|---|
author | Minkin, Ilia Medvedev, Paul |
author_facet | Minkin, Ilia Medvedev, Paul |
author_sort | Minkin, Ilia |
collection | PubMed |
description | Pairwise whole-genome homology mapping is the problem of finding all pairs of homologous intervals between a pair of genomes. As the number of available whole genomes has been rising dramatically in the last few years, there has been a need for more scalable homology mappers. In this paper, we develop an algorithm (BubbZ) for computing whole-genome pairwise homology mappings, especially in the context of all-to-all comparison for multiple genomes. BubbZ is based on an algorithm for computing chains in compacted de Bruijn graphs. We evaluate BubbZ on simulated datasets, a dataset composed of 16 long mouse genomes, and a large dataset of 1,600 Salmonella genomes. We show up to approximately an order of magnitude speed improvement, compared with MashMap2 and Minimap2, while retaining similar accuracy. |
format | Online Article Text |
id | pubmed-7303978 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-73039782020-06-22 Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ Minkin, Ilia Medvedev, Paul iScience Article Pairwise whole-genome homology mapping is the problem of finding all pairs of homologous intervals between a pair of genomes. As the number of available whole genomes has been rising dramatically in the last few years, there has been a need for more scalable homology mappers. In this paper, we develop an algorithm (BubbZ) for computing whole-genome pairwise homology mappings, especially in the context of all-to-all comparison for multiple genomes. BubbZ is based on an algorithm for computing chains in compacted de Bruijn graphs. We evaluate BubbZ on simulated datasets, a dataset composed of 16 long mouse genomes, and a large dataset of 1,600 Salmonella genomes. We show up to approximately an order of magnitude speed improvement, compared with MashMap2 and Minimap2, while retaining similar accuracy. Elsevier 2020-06-03 /pmc/articles/PMC7303978/ /pubmed/32563153 http://dx.doi.org/10.1016/j.isci.2020.101224 Text en © 2020 The Authors http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Article Minkin, Ilia Medvedev, Paul Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ |
title | Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ |
title_full | Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ |
title_fullStr | Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ |
title_full_unstemmed | Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ |
title_short | Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ |
title_sort | scalable pairwise whole-genome homology mapping of long genomes with bubbz |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7303978/ https://www.ncbi.nlm.nih.gov/pubmed/32563153 http://dx.doi.org/10.1016/j.isci.2020.101224 |
work_keys_str_mv | AT minkinilia scalablepairwisewholegenomehomologymappingoflonggenomeswithbubbz AT medvedevpaul scalablepairwisewholegenomehomologymappingoflonggenomeswithbubbz |