Cargando…
A speedup technique for (l, d)-motif finding algorithms
BACKGROUND: The discovery of patterns in DNA, RNA, and protein sequences has led to the solution of many vital biological problems. For instance, the identification of patterns in nucleic acid sequences has resulted in the determination of open reading frames, identification of promoter elements of...
Autores principales: | , |
---|---|
Formato: | Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2011
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3063805/ https://www.ncbi.nlm.nih.gov/pubmed/21385438 http://dx.doi.org/10.1186/1756-0500-4-54 |
_version_ | 1782200835873177600 |
---|---|
author | Rajasekaran, Sanguthevar Dinh, Hieu |
author_facet | Rajasekaran, Sanguthevar Dinh, Hieu |
author_sort | Rajasekaran, Sanguthevar |
collection | PubMed |
description | BACKGROUND: The discovery of patterns in DNA, RNA, and protein sequences has led to the solution of many vital biological problems. For instance, the identification of patterns in nucleic acid sequences has resulted in the determination of open reading frames, identification of promoter elements of genes, identification of intron/exon splicing sites, identification of SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have proven to be extremely helpful in domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, etc. Motifs are important patterns that are helpful in finding transcriptional regulatory elements, transcription factor binding sites, functional genomics, drug design, etc. As a result, numerous papers have been written to solve the motif search problem. RESULTS: Three versions of the motif search problem have been proposed in the literature: Simple Motif Search (SMS), (l, d)-motif search (or Planted Motif Search (PMS)), and Edit-distance-based Motif Search (EMS). In this paper we focus on PMS. Two kinds of algorithms can be found in the literature for solving the PMS problem: exact and approximate. An exact algorithm identifies the motifs always and an approximate algorithm may fail to identify some or all of the motifs. The exact version of PMS problem has been shown to be NP-hard. Exact algorithms proposed in the literature for PMS take time that is exponential in some of the underlying parameters. In this paper we propose a generic technique that can be used to speedup PMS algorithms. CONCLUSIONS: We present a speedup technique that can be used on any PMS algorithm. We have tested our speedup technique on a number of algorithms. These experimental results show that our speedup technique is indeed very effective. The implementation of algorithms is freely available on the web at http://www.engr.uconn.edu/rajasek/PMS4.zip |
format | Text |
id | pubmed-3063805 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-30638052011-03-25 A speedup technique for (l, d)-motif finding algorithms Rajasekaran, Sanguthevar Dinh, Hieu BMC Res Notes Research Article BACKGROUND: The discovery of patterns in DNA, RNA, and protein sequences has led to the solution of many vital biological problems. For instance, the identification of patterns in nucleic acid sequences has resulted in the determination of open reading frames, identification of promoter elements of genes, identification of intron/exon splicing sites, identification of SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have proven to be extremely helpful in domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, etc. Motifs are important patterns that are helpful in finding transcriptional regulatory elements, transcription factor binding sites, functional genomics, drug design, etc. As a result, numerous papers have been written to solve the motif search problem. RESULTS: Three versions of the motif search problem have been proposed in the literature: Simple Motif Search (SMS), (l, d)-motif search (or Planted Motif Search (PMS)), and Edit-distance-based Motif Search (EMS). In this paper we focus on PMS. Two kinds of algorithms can be found in the literature for solving the PMS problem: exact and approximate. An exact algorithm identifies the motifs always and an approximate algorithm may fail to identify some or all of the motifs. The exact version of PMS problem has been shown to be NP-hard. Exact algorithms proposed in the literature for PMS take time that is exponential in some of the underlying parameters. In this paper we propose a generic technique that can be used to speedup PMS algorithms. CONCLUSIONS: We present a speedup technique that can be used on any PMS algorithm. We have tested our speedup technique on a number of algorithms. These experimental results show that our speedup technique is indeed very effective. The implementation of algorithms is freely available on the web at http://www.engr.uconn.edu/rajasek/PMS4.zip BioMed Central 2011-03-08 /pmc/articles/PMC3063805/ /pubmed/21385438 http://dx.doi.org/10.1186/1756-0500-4-54 Text en Copyright ©2011 Rajasekaran et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Rajasekaran, Sanguthevar Dinh, Hieu A speedup technique for (l, d)-motif finding algorithms |
title | A speedup technique for (l, d)-motif finding algorithms |
title_full | A speedup technique for (l, d)-motif finding algorithms |
title_fullStr | A speedup technique for (l, d)-motif finding algorithms |
title_full_unstemmed | A speedup technique for (l, d)-motif finding algorithms |
title_short | A speedup technique for (l, d)-motif finding algorithms |
title_sort | speedup technique for (l, d)-motif finding algorithms |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3063805/ https://www.ncbi.nlm.nih.gov/pubmed/21385438 http://dx.doi.org/10.1186/1756-0500-4-54 |
work_keys_str_mv | AT rajasekaransanguthevar aspeeduptechniqueforldmotiffindingalgorithms AT dinhhieu aspeeduptechniqueforldmotiffindingalgorithms AT rajasekaransanguthevar speeduptechniqueforldmotiffindingalgorithms AT dinhhieu speeduptechniqueforldmotiffindingalgorithms |