Cargando…

Swiftly Computing Center Strings

BACKGROUND: The center string (or closest string) problem is a classic computer science problem with important applications in computational biology. Given k input strings and a distance threshold d, we search for a string within Hamming distance at most d to each input string. This problem is NP co...

Descripción completa

Detalles Bibliográficos
Autores principales: Hufsky, Franziska, Kuchenbecker, Léon, Jahn, Katharina, Stoye, Jens, Böcker, Sebastian
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2011
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3108310/
https://www.ncbi.nlm.nih.gov/pubmed/21504573
http://dx.doi.org/10.1186/1471-2105-12-106
_version_ 1782205303691935744
author Hufsky, Franziska
Kuchenbecker, Léon
Jahn, Katharina
Stoye, Jens
Böcker, Sebastian
author_facet Hufsky, Franziska
Kuchenbecker, Léon
Jahn, Katharina
Stoye, Jens
Böcker, Sebastian
author_sort Hufsky, Franziska
collection PubMed
description BACKGROUND: The center string (or closest string) problem is a classic computer science problem with important applications in computational biology. Given k input strings and a distance threshold d, we search for a string within Hamming distance at most d to each input string. This problem is NP complete. RESULTS: In this paper, we focus on exact methods for the problem that are also swift in application. We first introduce data reduction techniques that allow us to infer that certain instances have no solution, or that a center string must satisfy certain conditions. We describe how to use this information to speed up two previously published search tree algorithms. Then, we describe a novel iterative search strategy that is effecient in practice, where some of our reduction techniques can also be applied. Finally, we present results of an evaluation study for two different data sets from a biological application. CONCLUSIONS: We find that the running time for computing the optimal center string is dominated by the subroutine calls for d = d(opt )-1 and d = d(opt). Our data reduction is very effective for both, either rejecting unsolvable instances or solving trivial positions. We find that this speeds up computations considerably.
format Online
Article
Text
id pubmed-3108310
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-31083102011-06-07 Swiftly Computing Center Strings Hufsky, Franziska Kuchenbecker, Léon Jahn, Katharina Stoye, Jens Böcker, Sebastian BMC Bioinformatics Research Article BACKGROUND: The center string (or closest string) problem is a classic computer science problem with important applications in computational biology. Given k input strings and a distance threshold d, we search for a string within Hamming distance at most d to each input string. This problem is NP complete. RESULTS: In this paper, we focus on exact methods for the problem that are also swift in application. We first introduce data reduction techniques that allow us to infer that certain instances have no solution, or that a center string must satisfy certain conditions. We describe how to use this information to speed up two previously published search tree algorithms. Then, we describe a novel iterative search strategy that is effecient in practice, where some of our reduction techniques can also be applied. Finally, we present results of an evaluation study for two different data sets from a biological application. CONCLUSIONS: We find that the running time for computing the optimal center string is dominated by the subroutine calls for d = d(opt )-1 and d = d(opt). Our data reduction is very effective for both, either rejecting unsolvable instances or solving trivial positions. We find that this speeds up computations considerably. BioMed Central 2011-04-19 /pmc/articles/PMC3108310/ /pubmed/21504573 http://dx.doi.org/10.1186/1471-2105-12-106 Text en Copyright ©2011 Hufsky et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Hufsky, Franziska
Kuchenbecker, Léon
Jahn, Katharina
Stoye, Jens
Böcker, Sebastian
Swiftly Computing Center Strings
title Swiftly Computing Center Strings
title_full Swiftly Computing Center Strings
title_fullStr Swiftly Computing Center Strings
title_full_unstemmed Swiftly Computing Center Strings
title_short Swiftly Computing Center Strings
title_sort swiftly computing center strings
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3108310/
https://www.ncbi.nlm.nih.gov/pubmed/21504573
http://dx.doi.org/10.1186/1471-2105-12-106
work_keys_str_mv AT hufskyfranziska swiftlycomputingcenterstrings
AT kuchenbeckerleon swiftlycomputingcenterstrings
AT jahnkatharina swiftlycomputingcenterstrings
AT stoyejens swiftlycomputingcenterstrings
AT bockersebastian swiftlycomputingcenterstrings