Cargando…
Swiftly Computing Center Strings
BACKGROUND: The center string (or closest string) problem is a classic computer science problem with important applications in computational biology. Given k input strings and a distance threshold d, we search for a string within Hamming distance at most d to each input string. This problem is NP co...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2011
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3108310/ https://www.ncbi.nlm.nih.gov/pubmed/21504573 http://dx.doi.org/10.1186/1471-2105-12-106 |
_version_ | 1782205303691935744 |
---|---|
author | Hufsky, Franziska Kuchenbecker, Léon Jahn, Katharina Stoye, Jens Böcker, Sebastian |
author_facet | Hufsky, Franziska Kuchenbecker, Léon Jahn, Katharina Stoye, Jens Böcker, Sebastian |
author_sort | Hufsky, Franziska |
collection | PubMed |
description | BACKGROUND: The center string (or closest string) problem is a classic computer science problem with important applications in computational biology. Given k input strings and a distance threshold d, we search for a string within Hamming distance at most d to each input string. This problem is NP complete. RESULTS: In this paper, we focus on exact methods for the problem that are also swift in application. We first introduce data reduction techniques that allow us to infer that certain instances have no solution, or that a center string must satisfy certain conditions. We describe how to use this information to speed up two previously published search tree algorithms. Then, we describe a novel iterative search strategy that is effecient in practice, where some of our reduction techniques can also be applied. Finally, we present results of an evaluation study for two different data sets from a biological application. CONCLUSIONS: We find that the running time for computing the optimal center string is dominated by the subroutine calls for d = d(opt )-1 and d = d(opt). Our data reduction is very effective for both, either rejecting unsolvable instances or solving trivial positions. We find that this speeds up computations considerably. |
format | Online Article Text |
id | pubmed-3108310 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-31083102011-06-07 Swiftly Computing Center Strings Hufsky, Franziska Kuchenbecker, Léon Jahn, Katharina Stoye, Jens Böcker, Sebastian BMC Bioinformatics Research Article BACKGROUND: The center string (or closest string) problem is a classic computer science problem with important applications in computational biology. Given k input strings and a distance threshold d, we search for a string within Hamming distance at most d to each input string. This problem is NP complete. RESULTS: In this paper, we focus on exact methods for the problem that are also swift in application. We first introduce data reduction techniques that allow us to infer that certain instances have no solution, or that a center string must satisfy certain conditions. We describe how to use this information to speed up two previously published search tree algorithms. Then, we describe a novel iterative search strategy that is effecient in practice, where some of our reduction techniques can also be applied. Finally, we present results of an evaluation study for two different data sets from a biological application. CONCLUSIONS: We find that the running time for computing the optimal center string is dominated by the subroutine calls for d = d(opt )-1 and d = d(opt). Our data reduction is very effective for both, either rejecting unsolvable instances or solving trivial positions. We find that this speeds up computations considerably. BioMed Central 2011-04-19 /pmc/articles/PMC3108310/ /pubmed/21504573 http://dx.doi.org/10.1186/1471-2105-12-106 Text en Copyright ©2011 Hufsky et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Hufsky, Franziska Kuchenbecker, Léon Jahn, Katharina Stoye, Jens Böcker, Sebastian Swiftly Computing Center Strings |
title | Swiftly Computing Center Strings |
title_full | Swiftly Computing Center Strings |
title_fullStr | Swiftly Computing Center Strings |
title_full_unstemmed | Swiftly Computing Center Strings |
title_short | Swiftly Computing Center Strings |
title_sort | swiftly computing center strings |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3108310/ https://www.ncbi.nlm.nih.gov/pubmed/21504573 http://dx.doi.org/10.1186/1471-2105-12-106 |
work_keys_str_mv | AT hufskyfranziska swiftlycomputingcenterstrings AT kuchenbeckerleon swiftlycomputingcenterstrings AT jahnkatharina swiftlycomputingcenterstrings AT stoyejens swiftlycomputingcenterstrings AT bockersebastian swiftlycomputingcenterstrings |