Cargando…

RecMotif: a novel fast algorithm for weak motif discovery

BACKGROUND: Weak motif discovery in DNA sequences is an important but unresolved problem in computational biology. Previous algorithms that aimed to solve the problem usually require a large amount of memory or execution time. In this paper, we proposed a fast and memory efficient algorithm, RecMoti...

Descripción completa

Detalles Bibliográficos
Autores principales: Sun, He Quan, Low, Malcolm Yoke Hean, Hsu, Wen Jing, Rajapakse, Jagath C
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2010
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3024859/
https://www.ncbi.nlm.nih.gov/pubmed/21172058
http://dx.doi.org/10.1186/1471-2105-11-S11-S8
_version_ 1782196821882306560
author Sun, He Quan
Low, Malcolm Yoke Hean
Hsu, Wen Jing
Rajapakse, Jagath C
author_facet Sun, He Quan
Low, Malcolm Yoke Hean
Hsu, Wen Jing
Rajapakse, Jagath C
author_sort Sun, He Quan
collection PubMed
description BACKGROUND: Weak motif discovery in DNA sequences is an important but unresolved problem in computational biology. Previous algorithms that aimed to solve the problem usually require a large amount of memory or execution time. In this paper, we proposed a fast and memory efficient algorithm, RecMotif, which guarantees to discover all motifs with specific (l, d) settings (where l is the motif length and d is the maximum number of mutations between a motif instance and the true motif). RESULTS: Comparisons with several recently proposed algorithms have shown that RecMotif is more scalable for handling longer and weaker motifs. For instance, it can solve the open challenge cases such as (40, 14) within 5 hours while the other algorithms compared failed due to either longer execution times or shortage of memory space. For real biological sequences, such as E.coli CRP, RecMotif is able to accurately discover the motif instances with (l, d) as (18, 6) in less than 1 second, which is faster than the other algorithms compared. CONCLUSIONS: RecMotif is a novel algorithm that requires only a space complexity of O(m(2)n) (where m is the number of sequences in the data and n is the length of the sequences).
format Text
id pubmed-3024859
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30248592011-01-22 RecMotif: a novel fast algorithm for weak motif discovery Sun, He Quan Low, Malcolm Yoke Hean Hsu, Wen Jing Rajapakse, Jagath C BMC Bioinformatics Research BACKGROUND: Weak motif discovery in DNA sequences is an important but unresolved problem in computational biology. Previous algorithms that aimed to solve the problem usually require a large amount of memory or execution time. In this paper, we proposed a fast and memory efficient algorithm, RecMotif, which guarantees to discover all motifs with specific (l, d) settings (where l is the motif length and d is the maximum number of mutations between a motif instance and the true motif). RESULTS: Comparisons with several recently proposed algorithms have shown that RecMotif is more scalable for handling longer and weaker motifs. For instance, it can solve the open challenge cases such as (40, 14) within 5 hours while the other algorithms compared failed due to either longer execution times or shortage of memory space. For real biological sequences, such as E.coli CRP, RecMotif is able to accurately discover the motif instances with (l, d) as (18, 6) in less than 1 second, which is faster than the other algorithms compared. CONCLUSIONS: RecMotif is a novel algorithm that requires only a space complexity of O(m(2)n) (where m is the number of sequences in the data and n is the length of the sequences). BioMed Central 2010-12-14 /pmc/articles/PMC3024859/ /pubmed/21172058 http://dx.doi.org/10.1186/1471-2105-11-S11-S8 Text en Copyright ©2010 Sun et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Sun, He Quan
Low, Malcolm Yoke Hean
Hsu, Wen Jing
Rajapakse, Jagath C
RecMotif: a novel fast algorithm for weak motif discovery
title RecMotif: a novel fast algorithm for weak motif discovery
title_full RecMotif: a novel fast algorithm for weak motif discovery
title_fullStr RecMotif: a novel fast algorithm for weak motif discovery
title_full_unstemmed RecMotif: a novel fast algorithm for weak motif discovery
title_short RecMotif: a novel fast algorithm for weak motif discovery
title_sort recmotif: a novel fast algorithm for weak motif discovery
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3024859/
https://www.ncbi.nlm.nih.gov/pubmed/21172058
http://dx.doi.org/10.1186/1471-2105-11-S11-S8
work_keys_str_mv AT sunhequan recmotifanovelfastalgorithmforweakmotifdiscovery
AT lowmalcolmyokehean recmotifanovelfastalgorithmforweakmotifdiscovery
AT hsuwenjing recmotifanovelfastalgorithmforweakmotifdiscovery
AT rajapaksejagathc recmotifanovelfastalgorithmforweakmotifdiscovery