Cargando…

A speedup technique for (l, d)-motif finding algorithms

BACKGROUND: The discovery of patterns in DNA, RNA, and protein sequences has led to the solution of many vital biological problems. For instance, the identification of patterns in nucleic acid sequences has resulted in the determination of open reading frames, identification of promoter elements of...

Descripción completa

Detalles Bibliográficos
Autores principales: Rajasekaran, Sanguthevar, Dinh, Hieu
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2011
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3063805/
https://www.ncbi.nlm.nih.gov/pubmed/21385438
http://dx.doi.org/10.1186/1756-0500-4-54
_version_ 1782200835873177600
author Rajasekaran, Sanguthevar
Dinh, Hieu
author_facet Rajasekaran, Sanguthevar
Dinh, Hieu
author_sort Rajasekaran, Sanguthevar
collection PubMed
description BACKGROUND: The discovery of patterns in DNA, RNA, and protein sequences has led to the solution of many vital biological problems. For instance, the identification of patterns in nucleic acid sequences has resulted in the determination of open reading frames, identification of promoter elements of genes, identification of intron/exon splicing sites, identification of SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have proven to be extremely helpful in domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, etc. Motifs are important patterns that are helpful in finding transcriptional regulatory elements, transcription factor binding sites, functional genomics, drug design, etc. As a result, numerous papers have been written to solve the motif search problem. RESULTS: Three versions of the motif search problem have been proposed in the literature: Simple Motif Search (SMS), (l, d)-motif search (or Planted Motif Search (PMS)), and Edit-distance-based Motif Search (EMS). In this paper we focus on PMS. Two kinds of algorithms can be found in the literature for solving the PMS problem: exact and approximate. An exact algorithm identifies the motifs always and an approximate algorithm may fail to identify some or all of the motifs. The exact version of PMS problem has been shown to be NP-hard. Exact algorithms proposed in the literature for PMS take time that is exponential in some of the underlying parameters. In this paper we propose a generic technique that can be used to speedup PMS algorithms. CONCLUSIONS: We present a speedup technique that can be used on any PMS algorithm. We have tested our speedup technique on a number of algorithms. These experimental results show that our speedup technique is indeed very effective. The implementation of algorithms is freely available on the web at http://www.engr.uconn.edu/rajasek/PMS4.zip
format Text
id pubmed-3063805
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30638052011-03-25 A speedup technique for (l, d)-motif finding algorithms Rajasekaran, Sanguthevar Dinh, Hieu BMC Res Notes Research Article BACKGROUND: The discovery of patterns in DNA, RNA, and protein sequences has led to the solution of many vital biological problems. For instance, the identification of patterns in nucleic acid sequences has resulted in the determination of open reading frames, identification of promoter elements of genes, identification of intron/exon splicing sites, identification of SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have proven to be extremely helpful in domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, etc. Motifs are important patterns that are helpful in finding transcriptional regulatory elements, transcription factor binding sites, functional genomics, drug design, etc. As a result, numerous papers have been written to solve the motif search problem. RESULTS: Three versions of the motif search problem have been proposed in the literature: Simple Motif Search (SMS), (l, d)-motif search (or Planted Motif Search (PMS)), and Edit-distance-based Motif Search (EMS). In this paper we focus on PMS. Two kinds of algorithms can be found in the literature for solving the PMS problem: exact and approximate. An exact algorithm identifies the motifs always and an approximate algorithm may fail to identify some or all of the motifs. The exact version of PMS problem has been shown to be NP-hard. Exact algorithms proposed in the literature for PMS take time that is exponential in some of the underlying parameters. In this paper we propose a generic technique that can be used to speedup PMS algorithms. CONCLUSIONS: We present a speedup technique that can be used on any PMS algorithm. We have tested our speedup technique on a number of algorithms. These experimental results show that our speedup technique is indeed very effective. The implementation of algorithms is freely available on the web at http://www.engr.uconn.edu/rajasek/PMS4.zip BioMed Central 2011-03-08 /pmc/articles/PMC3063805/ /pubmed/21385438 http://dx.doi.org/10.1186/1756-0500-4-54 Text en Copyright ©2011 Rajasekaran et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Rajasekaran, Sanguthevar
Dinh, Hieu
A speedup technique for (l, d)-motif finding algorithms
title A speedup technique for (l, d)-motif finding algorithms
title_full A speedup technique for (l, d)-motif finding algorithms
title_fullStr A speedup technique for (l, d)-motif finding algorithms
title_full_unstemmed A speedup technique for (l, d)-motif finding algorithms
title_short A speedup technique for (l, d)-motif finding algorithms
title_sort speedup technique for (l, d)-motif finding algorithms
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3063805/
https://www.ncbi.nlm.nih.gov/pubmed/21385438
http://dx.doi.org/10.1186/1756-0500-4-54
work_keys_str_mv AT rajasekaransanguthevar aspeeduptechniqueforldmotiffindingalgorithms
AT dinhhieu aspeeduptechniqueforldmotiffindingalgorithms
AT rajasekaransanguthevar speeduptechniqueforldmotiffindingalgorithms
AT dinhhieu speeduptechniqueforldmotiffindingalgorithms