Block Aligner: an adaptive SIMD-accelerated aligner for sequences and position-specific scoring matrices

MOTIVATION: Efficiently aligning sequences is a fundamental problem in bioinformatics. Many recent algorithms for computing alignments through Smith–Waterman–Gotoh dynamic programming (DP) exploit Single Instruction Multiple Data (SIMD) operations on modern CPUs for speed. However, these advances ha...

Bibliographic Details
Main Authors: Liu, Daniel; Steinegger, Martin
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10457662/
https://www.ncbi.nlm.nih.gov/pubmed/37535681
http://dx.doi.org/10.1093/bioinformatics/btad487
_version_ 1785096980042612736
author Liu, Daniel
Steinegger, Martin
author_facet Liu, Daniel
Steinegger, Martin
author_sort Liu, Daniel
collection PubMed
description MOTIVATION: Efficiently aligning sequences is a fundamental problem in bioinformatics. Many recent algorithms for computing alignments through Smith–Waterman–Gotoh dynamic programming (DP) exploit Single Instruction Multiple Data (SIMD) operations on modern CPUs for speed. However, these advances have largely ignored difficulties associated with efficiently handling complex scoring matrices or large gaps (insertions or deletions). RESULTS: We propose a new SIMD-accelerated algorithm called Block Aligner for aligning nucleotide and protein sequences against other sequences or position-specific scoring matrices. We introduce a new paradigm that uses blocks in the DP matrix that greedily shift, grow, and shrink. This approach allows regions of the DP matrix to be adaptively computed. Our algorithm reaches over 5–10 times faster than some previous methods while incurring an error rate of less than 3% on protein and long read datasets, despite large gaps and low sequence identities. AVAILABILITY AND IMPLEMENTATION: Our algorithm is implemented for global, local, and X-drop alignments. It is available as a Rust library (with C bindings) at https://github.com/Daniel-Liu-c0deb0t/block-aligner.
format Online
Article
Text
id pubmed-10457662
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10457662 2023-08-27 Block Aligner: an adaptive SIMD-accelerated aligner for sequences and position-specific scoring matrices Liu, Daniel Steinegger, Martin Bioinformatics Original Paper MOTIVATION: Efficiently aligning sequences is a fundamental problem in bioinformatics. Many recent algorithms for computing alignments through Smith–Waterman–Gotoh dynamic programming (DP) exploit Single Instruction Multiple Data (SIMD) operations on modern CPUs for speed. However, these advances have largely ignored difficulties associated with efficiently handling complex scoring matrices or large gaps (insertions or deletions). RESULTS: We propose a new SIMD-accelerated algorithm called Block Aligner for aligning nucleotide and protein sequences against other sequences or position-specific scoring matrices. We introduce a new paradigm that uses blocks in the DP matrix that greedily shift, grow, and shrink. This approach allows regions of the DP matrix to be adaptively computed. Our algorithm reaches over 5–10 times faster than some previous methods while incurring an error rate of less than 3% on protein and long read datasets, despite large gaps and low sequence identities. AVAILABILITY AND IMPLEMENTATION: Our algorithm is implemented for global, local, and X-drop alignments. It is available as a Rust library (with C bindings) at https://github.com/Daniel-Liu-c0deb0t/block-aligner. Oxford University Press 2023-08-03 /pmc/articles/PMC10457662/ /pubmed/37535681 http://dx.doi.org/10.1093/bioinformatics/btad487 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Block Aligner: an adaptive SIMD-accelerated aligner for sequences and position-specific scoring matrices
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10457662/
https://www.ncbi.nlm.nih.gov/pubmed/37535681
http://dx.doi.org/10.1093/bioinformatics/btad487