Cargando…

Optimal gap-affine alignment in O(s) space

MOTIVATION: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algor...

Descripción completa

Detalles Bibliográficos
Autores principales: Marco-Sola, Santiago, Eizenga, Jordan M, Guarracino, Andrea, Paten, Benedict, Garrison, Erik, Moreto, Miquel
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940620/
https://www.ncbi.nlm.nih.gov/pubmed/36749013
http://dx.doi.org/10.1093/bioinformatics/btad074
_version_ 1784891120754360320
author Marco-Sola, Santiago
Eizenga, Jordan M
Guarracino, Andrea
Paten, Benedict
Garrison, Erik
Moreto, Miquel
author_facet Marco-Sola, Santiago
Eizenga, Jordan M
Guarracino, Andrea
Paten, Benedict
Garrison, Erik
Moreto, Miquel
author_sort Marco-Sola, Santiago
collection PubMed
description MOTIVATION: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) introduced an efficient algorithm to perform exact gap-affine alignment in [Formula: see text] time, where s is the optimal score and n is the sequence length. Notwithstanding these bounds, WFA’s [Formula: see text] memory requirements become computationally impractical for genome-scale alignments, leading to a need for further improvement. RESULTS: In this article, we present the bidirectional WFA algorithm, the first gap-affine algorithm capable of computing optimal alignments in [Formula: see text] memory while retaining WFA’s time complexity of [Formula: see text]. As a result, this work improves the lowest known memory bound [Formula: see text] to compute gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long while maintaining competitive execution times. AVAILABILITY AND IMPLEMENTATION: All code is publicly available at https://github.com/smarco/BiWFA-paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9940620
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-99406202023-02-21 Optimal gap-affine alignment in O(s) space Marco-Sola, Santiago Eizenga, Jordan M Guarracino, Andrea Paten, Benedict Garrison, Erik Moreto, Miquel Bioinformatics Original Paper MOTIVATION: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) introduced an efficient algorithm to perform exact gap-affine alignment in [Formula: see text] time, where s is the optimal score and n is the sequence length. Notwithstanding these bounds, WFA’s [Formula: see text] memory requirements become computationally impractical for genome-scale alignments, leading to a need for further improvement. RESULTS: In this article, we present the bidirectional WFA algorithm, the first gap-affine algorithm capable of computing optimal alignments in [Formula: see text] memory while retaining WFA’s time complexity of [Formula: see text]. As a result, this work improves the lowest known memory bound [Formula: see text] to compute gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long while maintaining competitive execution times. AVAILABILITY AND IMPLEMENTATION: All code is publicly available at https://github.com/smarco/BiWFA-paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2023-02-07 /pmc/articles/PMC9940620/ /pubmed/36749013 http://dx.doi.org/10.1093/bioinformatics/btad074 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Paper
Marco-Sola, Santiago
Eizenga, Jordan M
Guarracino, Andrea
Paten, Benedict
Garrison, Erik
Moreto, Miquel
Optimal gap-affine alignment in O(s) space
title Optimal gap-affine alignment in O(s) space
title_full Optimal gap-affine alignment in O(s) space
title_fullStr Optimal gap-affine alignment in O(s) space
title_full_unstemmed Optimal gap-affine alignment in O(s) space
title_short Optimal gap-affine alignment in O(s) space
title_sort optimal gap-affine alignment in o(s) space
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940620/
https://www.ncbi.nlm.nih.gov/pubmed/36749013
http://dx.doi.org/10.1093/bioinformatics/btad074
work_keys_str_mv AT marcosolasantiago optimalgapaffinealignmentinosspace
AT eizengajordanm optimalgapaffinealignmentinosspace
AT guarracinoandrea optimalgapaffinealignmentinosspace
AT patenbenedict optimalgapaffinealignmentinosspace
AT garrisonerik optimalgapaffinealignmentinosspace
AT moretomiquel optimalgapaffinealignmentinosspace