Seedability: optimizing alignment parameters for sensitive sequence comparison
MOTIVATION: Most sequence alignment techniques make use of exact k-mer hits, called seeds, as anchors to optimize alignment speed. A large number of bioinformatics tools employing seed-based alignment techniques, such as minimap2, use a single value of k per sequencing technology, withou...
Main authors: | Ayad, Lorraine A K; Chikhi, Rayan; Pissis, Solon P |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Original Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444664/ https://www.ncbi.nlm.nih.gov/pubmed/37621456 http://dx.doi.org/10.1093/bioadv/vbad108 |
_version_ | 1785093998988230656 |
---|---|
author | Ayad, Lorraine A K; Chikhi, Rayan; Pissis, Solon P |
author_facet | Ayad, Lorraine A K; Chikhi, Rayan; Pissis, Solon P |
author_sort | Ayad, Lorraine A K |
collection | PubMed |
description | MOTIVATION: Most sequence alignment techniques make use of exact k-mer hits, called seeds, as anchors to optimize alignment speed. A large number of bioinformatics tools employing seed-based alignment techniques, such as minimap2, use a single value of k per sequencing technology, without a strong guarantee that this is the best possible value. Given the ubiquity of sequence alignment, identifying values of k that lead to more sensitive alignments is thus an important task. To aid this, we present Seedability, a seed-based alignment framework designed for estimating an optimal seed k-mer length (as well as a minimal number of shared seeds) based on a given alignment identity threshold. In particular, we were motivated to make minimap2 more sensitive in the pairwise alignment of short sequences. RESULTS: The experimental results herein show improved alignments of short and divergent sequences when using the parameter values determined by Seedability in comparison to the default values of minimap2. We also show several cases of pairs of real divergent sequences, where the default parameter values of minimap2 yield no output alignments, but the values output by Seedability produce plausible alignments. AVAILABILITY AND IMPLEMENTATION: https://github.com/lorrainea/Seedability (distributed under GPL v3.0). |
format | Online Article Text |
id | pubmed-10444664 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10444664 2023-08-24 Seedability: optimizing alignment parameters for sensitive sequence comparison Ayad, Lorraine A K; Chikhi, Rayan; Pissis, Solon P Bioinform Adv Original Article MOTIVATION: Most sequence alignment techniques make use of exact k-mer hits, called seeds, as anchors to optimize alignment speed. A large number of bioinformatics tools employing seed-based alignment techniques, such as minimap2, use a single value of k per sequencing technology, without a strong guarantee that this is the best possible value. Given the ubiquity of sequence alignment, identifying values of k that lead to more sensitive alignments is thus an important task. To aid this, we present Seedability, a seed-based alignment framework designed for estimating an optimal seed k-mer length (as well as a minimal number of shared seeds) based on a given alignment identity threshold. In particular, we were motivated to make minimap2 more sensitive in the pairwise alignment of short sequences. RESULTS: The experimental results herein show improved alignments of short and divergent sequences when using the parameter values determined by Seedability in comparison to the default values of minimap2. We also show several cases of pairs of real divergent sequences, where the default parameter values of minimap2 yield no output alignments, but the values output by Seedability produce plausible alignments. AVAILABILITY AND IMPLEMENTATION: https://github.com/lorrainea/Seedability (distributed under GPL v3.0). Oxford University Press 2023-08-12 /pmc/articles/PMC10444664/ /pubmed/37621456 http://dx.doi.org/10.1093/bioadv/vbad108 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Ayad, Lorraine A K Chikhi, Rayan Pissis, Solon P Seedability: optimizing alignment parameters for sensitive sequence comparison |
title | Seedability: optimizing alignment parameters for sensitive sequence comparison |
title_full | Seedability: optimizing alignment parameters for sensitive sequence comparison |
title_fullStr | Seedability: optimizing alignment parameters for sensitive sequence comparison |
title_full_unstemmed | Seedability: optimizing alignment parameters for sensitive sequence comparison |
title_short | Seedability: optimizing alignment parameters for sensitive sequence comparison |
title_sort | seedability: optimizing alignment parameters for sensitive sequence comparison |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444664/ https://www.ncbi.nlm.nih.gov/pubmed/37621456 http://dx.doi.org/10.1093/bioadv/vbad108 |
work_keys_str_mv | AT ayadlorraineak seedabilityoptimizingalignmentparametersforsensitivesequencecomparison AT chikhirayan seedabilityoptimizingalignmentparametersforsensitivesequencecomparison AT pississolonp seedabilityoptimizingalignmentparametersforsensitivesequencecomparison |
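
The abstract in this record describes seeds as exact k-mer hits shared between two sequences, and notes that the choice of k affects sensitivity. The following is a minimal sketch of that idea only. It is not the estimation procedure of the tool described in the record, and the function names (`kmer_positions`, `shared_seeds`, `largest_k_with_min_seeds`) and the brute-force search over k are hypothetical, chosen purely for illustration.

```python
from collections import defaultdict


def kmer_positions(seq, k):
    """Map every k-mer of seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def shared_seeds(a, b, k):
    """Return all (i, j) pairs where a[i:i+k] == b[j:j+k] (exact k-mer hits)."""
    index = kmer_positions(a, k)
    hits = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            hits.append((i, j))
    return hits


def largest_k_with_min_seeds(a, b, min_seeds, k_max=15, k_min=3):
    """Hypothetical brute-force helper: the largest k in [k_min, k_max]
    for which a and b share at least min_seeds exact k-mer hits.
    Returns None if no such k exists."""
    for k in range(k_max, k_min - 1, -1):
        if len(shared_seeds(a, b, k)) >= min_seeds:
            return k
    return None


if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTTCGTAC"  # differs from a by one substitution
    for k in (4, 5, 6):
        print(k, len(shared_seeds(a, b, k)))
    print(largest_k_with_min_seeds(a, b, min_seeds=1, k_max=8))
```

On this toy pair, a single substitution already removes every shared 6-mer while shorter k-mers still produce hits, which illustrates why a fixed k may miss alignments between short, divergent sequences.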