Cargando…
Efficient sequential and parallel algorithms for planted motif search
BACKGROUND: Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want t...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2014
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3924400/ https://www.ncbi.nlm.nih.gov/pubmed/24479443 http://dx.doi.org/10.1186/1471-2105-15-34 |
_version_ | 1782303738779664384 |
---|---|
author | Nicolae, Marius Rajasekaran, Sanguthevar |
author_facet | Nicolae, Marius Rajasekaran, Sanguthevar |
author_sort | Nicolae, Marius |
collection | PubMed |
description | BACKGROUND: Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances. RESULTS: This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 l-mers to have a common d-neighbor. The program is freely available at http://engr.uconn.edu/~man09004/PMS8/. CONCLUSIONS: We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms. |
format | Online Article Text |
id | pubmed-3924400 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-39244002014-03-03 Efficient sequential and parallel algorithms for planted motif search Nicolae, Marius Rajasekaran, Sanguthevar BMC Bioinformatics Research Article BACKGROUND: Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances. RESULTS: This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 l-mers to have a common d-neighbor. The program is freely available at http://engr.uconn.edu/~man09004/PMS8/. CONCLUSIONS: We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms. BioMed Central 2014-01-31 /pmc/articles/PMC3924400/ /pubmed/24479443 http://dx.doi.org/10.1186/1471-2105-15-34 Text en Copyright © 2014 Nicolae and Rajasekaran; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Nicolae, Marius Rajasekaran, Sanguthevar Efficient sequential and parallel algorithms for planted motif search |
title | Efficient sequential and parallel algorithms for planted motif search |
title_full | Efficient sequential and parallel algorithms for planted motif search |
title_fullStr | Efficient sequential and parallel algorithms for planted motif search |
title_full_unstemmed | Efficient sequential and parallel algorithms for planted motif search |
title_short | Efficient sequential and parallel algorithms for planted motif search |
title_sort | efficient sequential and parallel algorithms for planted motif search |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3924400/ https://www.ncbi.nlm.nih.gov/pubmed/24479443 http://dx.doi.org/10.1186/1471-2105-15-34 |
work_keys_str_mv | AT nicolaemarius efficientsequentialandparallelalgorithmsforplantedmotifsearch AT rajasekaransanguthevar efficientsequentialandparallelalgorithmsforplantedmotifsearch |