RabbitKSSD: accelerating genome distance estimation on modern multi-core architectures
SUMMARY: We propose RabbitKSSD, a high-speed genome distance estimation tool. Specifically, we leverage load-balanced task partitioning, fast I/O, efficient intermediate result accesses, and high-performance data structures to improve overall efficiency. Our performance evaluation demonstrates that...
Main Authors: | Xu, Xiaoming; Yin, Zekun; Yan, Lifeng; Yi, Huiguang; Wang, Hua; Schmidt, Bertil; Liu, Weiguo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Applications Note |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10681859/ https://www.ncbi.nlm.nih.gov/pubmed/37971961 http://dx.doi.org/10.1093/bioinformatics/btad695 |
_version_ | 1785150847896780800 |
---|---|
author | Xu, Xiaoming; Yin, Zekun; Yan, Lifeng; Yi, Huiguang; Wang, Hua; Schmidt, Bertil; Liu, Weiguo |
author_facet | Xu, Xiaoming; Yin, Zekun; Yan, Lifeng; Yi, Huiguang; Wang, Hua; Schmidt, Bertil; Liu, Weiguo |
author_sort | Xu, Xiaoming |
collection | PubMed |
description | SUMMARY: We propose RabbitKSSD, a high-speed genome distance estimation tool. Specifically, we leverage load-balanced task partitioning, fast I/O, efficient intermediate result accesses, and high-performance data structures to improve overall efficiency. Our performance evaluation demonstrates that RabbitKSSD achieves speedups ranging from 5.7× to 19.8× over Kssd for the time-consuming sketch generation and distance computation on commonly used workstations. In addition, it significantly outperforms Mash, BinDash, and Dashing2. Moreover, RabbitKSSD can efficiently perform all-vs-all distance computation for all RefSeq complete bacterial genomes (455 GB in FASTA format) in just 2 min on a 64-core workstation. AVAILABILITY AND IMPLEMENTATION: RabbitKSSD is available at https://github.com/RabbitBio/RabbitKSSD. |
format | Online Article Text |
id | pubmed-10681859 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10681859 2023-11-30 RabbitKSSD: accelerating genome distance estimation on modern multi-core architectures Xu, Xiaoming; Yin, Zekun; Yan, Lifeng; Yi, Huiguang; Wang, Hua; Schmidt, Bertil; Liu, Weiguo Bioinformatics Applications Note SUMMARY: We propose RabbitKSSD, a high-speed genome distance estimation tool. Specifically, we leverage load-balanced task partitioning, fast I/O, efficient intermediate result accesses, and high-performance data structures to improve overall efficiency. Our performance evaluation demonstrates that RabbitKSSD achieves speedups ranging from 5.7× to 19.8× over Kssd for the time-consuming sketch generation and distance computation on commonly used workstations. In addition, it significantly outperforms Mash, BinDash, and Dashing2. Moreover, RabbitKSSD can efficiently perform all-vs-all distance computation for all RefSeq complete bacterial genomes (455 GB in FASTA format) in just 2 min on a 64-core workstation. AVAILABILITY AND IMPLEMENTATION: RabbitKSSD is available at https://github.com/RabbitBio/RabbitKSSD. Oxford University Press 2023-11-16 /pmc/articles/PMC10681859/ /pubmed/37971961 http://dx.doi.org/10.1093/bioinformatics/btad695 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Note Xu, Xiaoming Yin, Zekun Yan, Lifeng Yi, Huiguang Wang, Hua Schmidt, Bertil Liu, Weiguo RabbitKSSD: accelerating genome distance estimation on modern multi-core architectures |
title | RabbitKSSD: accelerating genome distance estimation on modern multi-core architectures |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10681859/ https://www.ncbi.nlm.nih.gov/pubmed/37971961 http://dx.doi.org/10.1093/bioinformatics/btad695 |