Cargando…

A combinatorial optimization approach for diverse motif finding applications

BACKGROUND: Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely-studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared funct...

Descripción completa

Detalles Bibliográficos
Autores principales: Zaslavsky, Elena, Singh, Mona
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2006
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1570465/
https://www.ncbi.nlm.nih.gov/pubmed/16916460
http://dx.doi.org/10.1186/1748-7188-1-13
_version_ 1782130269383294976
author Zaslavsky, Elena
Singh, Mona
author_facet Zaslavsky, Elena
Singh, Mona
author_sort Zaslavsky, Elena
collection PubMed
description BACKGROUND: Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely-studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared functional and structural elements within protein sequences. Due to the diversity of contexts in which motif finding is applied, several variations of the problem are commonly studied. RESULTS: We introduce a versatile combinatorial optimization framework for motif finding that couples graph pruning techniques with a novel integer linear programming formulation. Our approach is flexible and robust enough to model several variants of the motif finding problem, including those incorporating substitution matrices and phylogenetic distances. Additionally, we give an approach for determining statistical significance of uncovered motifs. In testing on numerous DNA and protein datasets, we demonstrate that our approach typically identifies statistically significant motifs corresponding to either known motifs or other motifs of high conservation. Moreover, in most cases, our approach finds provably optimal solutions to the underlying optimization problem. CONCLUSION: Our results demonstrate that a combined graph theoretic and mathematical programming approach can be the basis for effective and powerful techniques for diverse motif finding applications.
format Text
id pubmed-1570465
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-15704652006-09-26 A combinatorial optimization approach for diverse motif finding applications Zaslavsky, Elena Singh, Mona Algorithms Mol Biol Research BACKGROUND: Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely-studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared functional and structural elements within protein sequences. Due to the diversity of contexts in which motif finding is applied, several variations of the problem are commonly studied. RESULTS: We introduce a versatile combinatorial optimization framework for motif finding that couples graph pruning techniques with a novel integer linear programming formulation. Our approach is flexible and robust enough to model several variants of the motif finding problem, including those incorporating substitution matrices and phylogenetic distances. Additionally, we give an approach for determining statistical significance of uncovered motifs. In testing on numerous DNA and protein datasets, we demonstrate that our approach typically identifies statistically significant motifs corresponding to either known motifs or other motifs of high conservation. Moreover, in most cases, our approach finds provably optimal solutions to the underlying optimization problem. CONCLUSION: Our results demonstrate that a combined graph theoretic and mathematical programming approach can be the basis for effective and powerful techniques for diverse motif finding applications. BioMed Central 2006-08-17 /pmc/articles/PMC1570465/ /pubmed/16916460 http://dx.doi.org/10.1186/1748-7188-1-13 Text en Copyright © 2006 Zaslavsky and Singh; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Zaslavsky, Elena
Singh, Mona
A combinatorial optimization approach for diverse motif finding applications
title A combinatorial optimization approach for diverse motif finding applications
title_full A combinatorial optimization approach for diverse motif finding applications
title_fullStr A combinatorial optimization approach for diverse motif finding applications
title_full_unstemmed A combinatorial optimization approach for diverse motif finding applications
title_short A combinatorial optimization approach for diverse motif finding applications
title_sort combinatorial optimization approach for diverse motif finding applications
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1570465/
https://www.ncbi.nlm.nih.gov/pubmed/16916460
http://dx.doi.org/10.1186/1748-7188-1-13
work_keys_str_mv AT zaslavskyelena acombinatorialoptimizationapproachfordiversemotiffindingapplications
AT singhmona acombinatorialoptimizationapproachfordiversemotiffindingapplications
AT zaslavskyelena combinatorialoptimizationapproachfordiversemotiffindingapplications
AT singhmona combinatorialoptimizationapproachfordiversemotiffindingapplications