Seedability: optimizing alignment parameters for sensitive sequence comparison

Bibliographic Details
Main authors: Ayad, Lorraine A K; Chikhi, Rayan; Pissis, Solon P
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444664/
https://www.ncbi.nlm.nih.gov/pubmed/37621456
http://dx.doi.org/10.1093/bioadv/vbad108
Description
Summary: MOTIVATION: Most sequence alignment techniques make use of exact k-mer hits, called seeds, as anchors to optimize alignment speed. A large number of bioinformatics tools employing seed-based alignment techniques, such as Minimap2, use a single value of k per sequencing technology, without a strong guarantee that this is the best possible value. Given the ubiquity of sequence alignment, identifying values of k that lead to more sensitive alignments is thus an important task. To aid this, we present Seedability, a seed-based alignment framework designed for estimating an optimal seed k-mer length (as well as a minimal number of shared seeds) based on a given alignment identity threshold. In particular, we were motivated to make Minimap2 more sensitive in the pairwise alignment of short sequences. RESULTS: The experimental results herein show improved alignments of short and divergent sequences when using the parameter values determined by Seedability in comparison to the default values of Minimap2. We also show several cases of pairs of real divergent sequences, where the default parameter values of Minimap2 yield no output alignments, but the values output by Seedability produce plausible alignments. AVAILABILITY AND IMPLEMENTATION: https://github.com/lorrainea/Seedability (distributed under GPL v3.0).
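
Illustrative note: the summary's central idea, that the seed length k determines how many exact k-mer hits two sequences share, can be seen in a few lines of Python. This is only a toy sketch and not the estimation procedure used by Seedability. The function names `kmers` and `shared_seed_counts` and the two example sequences are hypothetical, chosen purely for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_seed_counts(a, b, k_values):
    """Map each k to the number of distinct k-mers that occur in both sequences."""
    return {k: len(kmers(a, k) & kmers(b, k)) for k in k_values}


# Two made-up sequences of length 11 that differ by one substitution.
a = "ACGTACGGTCA"
b = "ACGTACGATCA"

counts = shared_seed_counts(a, b, [3, 5, 8])
print(counts)
```

In this toy pair, short seeds still yield several exact hits, while for k = 8 every window spans the substituted position and no seed is shared. A fixed, larger k can therefore leave short, divergent sequences without any anchors, which is the situation the summary describes.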