
Efficient computation of spaced seeds

BACKGROUND: The most frequently used tools in bioinformatics are those searching for similarities, or local alignments, between biological sequences. Since the exact dynamic programming algorithm is quadratic, linear-time heuristics such as BLAST are used. Spaced seeds are much more sensitive than t...


Bibliographic Details
Main author: Ilie, Silvana
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3392737/
https://www.ncbi.nlm.nih.gov/pubmed/22373455
http://dx.doi.org/10.1186/1756-0500-5-123
_version_ 1782237635509485568
author Ilie, Silvana
collection PubMed
description BACKGROUND: The most frequently used tools in bioinformatics are those that search for similarities, or local alignments, between biological sequences. Since the exact dynamic programming algorithm is quadratic, linear-time heuristics such as BLAST are used. Spaced seeds are much more sensitive than the consecutive seed of BLAST, and using several seeds represents the current state of the art in approximate search for biological sequences. The most important aspect is computing highly sensitive seeds. Since the problem seems hard, heuristic algorithms are used. The leading software in the common Bernoulli model is the SpEED program. FINDINGS: SpEED uses a hill climbing method based on the overlap complexity heuristic. We propose a new algorithm for this heuristic that improves its speed by over one order of magnitude. We use the new implementation to compute improved seeds for several software programs. We also compute multiple seeds of the same weight as the MegaBLAST seed, which greatly improve its sensitivity. CONCLUSION: Multiple spaced seeds are being successfully used in bioinformatics software programs. Enabling researchers to compute high-quality seeds very quickly will help expand the range of their applications.
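The notions of a spaced seed and of its sensitivity in the Bernoulli model, as mentioned in the description above, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `1`/`*` seed notation (`1` is a required match, `*` is a don't-care position), the function names `hits` and `sensitivity`, and the brute-force enumeration, which is only feasible for very short regions. This is not the overlap complexity heuristic or the hill climbing method used by SpEED; it only shows the quantity that such heuristics are meant to optimize.

```python
from itertools import product


def hits(seed: str, alignment: str) -> bool:
    """Return True if the seed matches the alignment at some offset.

    seed:      string over {'1', '*'}; '1' = match required, '*' = don't care
    alignment: string over {'1', '0'}; '1' = match, '0' = mismatch
    """
    required = [i for i, c in enumerate(seed) if c == "1"]
    for start in range(len(alignment) - len(seed) + 1):
        if all(alignment[start + i] == "1" for i in required):
            return True
    return False


def sensitivity(seeds, length: int, p: float) -> float:
    """Exact probability that at least one seed hits a random alignment.

    Bernoulli model: each of the `length` positions is a match ('1')
    independently with probability p. Enumerates all 2**length alignments,
    so this is only practical for small `length`.
    """
    total = 0.0
    for bits in product("01", repeat=length):
        alignment = "".join(bits)
        if any(hits(s, alignment) for s in seeds):
            k = alignment.count("1")
            total += p**k * (1 - p) ** (length - k)
    return total


if __name__ == "__main__":
    # Compare a consecutive seed, a spaced seed of the same weight,
    # and the two used together, on a short region.
    for seeds in (["111"], ["11*1"], ["111", "11*1"]):
        print(seeds, sensitivity(seeds, length=10, p=0.7))
```

Using several seeds together can only increase the hit probability relative to any one of them, which is the reason multiple spaced seeds are of interest; realistic region lengths and seed weights require far more efficient methods than this enumeration.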
format Online
Article
Text
id pubmed-3392737
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3392737 2012-07-11 Efficient computation of spaced seeds Ilie, Silvana BMC Res Notes Technical Note BioMed Central 2012-02-28 /pmc/articles/PMC3392737/ /pubmed/22373455 http://dx.doi.org/10.1186/1756-0500-5-123 Text en Copyright ©2011 Ilie et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Technical Note
Ilie, Silvana
Efficient computation of spaced seeds
title Efficient computation of spaced seeds
topic Technical Note
work_keys_str_mv AT iliesilvana efficientcomputationofspacedseeds