Cargando…
Optimal gap-affine alignment in O(s) space
MOTIVATION: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algor...
Autores principales: | , , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Oxford University Press
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940620/ https://www.ncbi.nlm.nih.gov/pubmed/36749013 http://dx.doi.org/10.1093/bioinformatics/btad074 |
_version_ | 1784891120754360320 |
---|---|
author | Marco-Sola, Santiago Eizenga, Jordan M Guarracino, Andrea Paten, Benedict Garrison, Erik Moreto, Miquel |
author_facet | Marco-Sola, Santiago Eizenga, Jordan M Guarracino, Andrea Paten, Benedict Garrison, Erik Moreto, Miquel |
author_sort | Marco-Sola, Santiago |
collection | PubMed |
description | MOTIVATION: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) introduced an efficient algorithm to perform exact gap-affine alignment in [Formula: see text] time, where s is the optimal score and n is the sequence length. Notwithstanding these bounds, WFA’s [Formula: see text] memory requirements become computationally impractical for genome-scale alignments, leading to a need for further improvement. RESULTS: In this article, we present the bidirectional WFA algorithm, the first gap-affine algorithm capable of computing optimal alignments in [Formula: see text] memory while retaining WFA’s time complexity of [Formula: see text]. As a result, this work improves the lowest known memory bound [Formula: see text] to compute gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long while maintaining competitive execution times. AVAILABILITY AND IMPLEMENTATION: All code is publicly available at https://github.com/smarco/BiWFA-paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9940620 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-99406202023-02-21 Optimal gap-affine alignment in O(s) space Marco-Sola, Santiago Eizenga, Jordan M Guarracino, Andrea Paten, Benedict Garrison, Erik Moreto, Miquel Bioinformatics Original Paper MOTIVATION: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) introduced an efficient algorithm to perform exact gap-affine alignment in [Formula: see text] time, where s is the optimal score and n is the sequence length. Notwithstanding these bounds, WFA’s [Formula: see text] memory requirements become computationally impractical for genome-scale alignments, leading to a need for further improvement. RESULTS: In this article, we present the bidirectional WFA algorithm, the first gap-affine algorithm capable of computing optimal alignments in [Formula: see text] memory while retaining WFA’s time complexity of [Formula: see text]. As a result, this work improves the lowest known memory bound [Formula: see text] to compute gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long while maintaining competitive execution times. AVAILABILITY AND IMPLEMENTATION: All code is publicly available at https://github.com/smarco/BiWFA-paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2023-02-07 /pmc/articles/PMC9940620/ /pubmed/36749013 http://dx.doi.org/10.1093/bioinformatics/btad074 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Marco-Sola, Santiago Eizenga, Jordan M Guarracino, Andrea Paten, Benedict Garrison, Erik Moreto, Miquel Optimal gap-affine alignment in O(s) space |
title | Optimal gap-affine alignment in O(s) space |
title_full | Optimal gap-affine alignment in O(s) space |
title_fullStr | Optimal gap-affine alignment in O(s) space |
title_full_unstemmed | Optimal gap-affine alignment in O(s) space |
title_short | Optimal gap-affine alignment in O(s) space |
title_sort | optimal gap-affine alignment in o(s) space |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940620/ https://www.ncbi.nlm.nih.gov/pubmed/36749013 http://dx.doi.org/10.1093/bioinformatics/btad074 |
work_keys_str_mv | AT marcosolasantiago optimalgapaffinealignmentinosspace AT eizengajordanm optimalgapaffinealignmentinosspace AT guarracinoandrea optimalgapaffinealignmentinosspace AT patenbenedict optimalgapaffinealignmentinosspace AT garrisonerik optimalgapaffinealignmentinosspace AT moretomiquel optimalgapaffinealignmentinosspace |