
MotifClick: prediction of cis-regulatory binding sites via merging cliques

BACKGROUND: Although dozens of algorithms and tools have been developed to find a set of cis-regulatory binding sites called a motif in a set of intergenic sequences using various approaches, most of these tools focus on identifying binding sites that are significantly different from their backgroun...


Bibliographic Details
Main Authors: Zhang, Shaoqiang, Li, Shan, Niu, Meng, Pham, Phuc T, Su, Zhengchang
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3225181/
https://www.ncbi.nlm.nih.gov/pubmed/21679436
http://dx.doi.org/10.1186/1471-2105-12-238
author Zhang, Shaoqiang
Li, Shan
Niu, Meng
Pham, Phuc T
Su, Zhengchang
collection PubMed
description BACKGROUND: Although dozens of algorithms and tools have been developed to find a set of cis-regulatory binding sites called a motif in a set of intergenic sequences using various approaches, most of these tools focus on identifying binding sites that are significantly different from their background sequences. However, some motifs may have a similar nucleotide distribution to that of their background sequences. Therefore, such binding sites can be missed by these tools. RESULTS: Here, we present a graph-based polynomial-time algorithm, MotifClick, for the prediction of cis-regulatory binding sites, in particular, those that have a similar nucleotide distribution to that of their background sequences. To find binding sites with length k, we construct a graph using some 2(k-1)-mers in the input sequences as the vertices, and connect two vertices by an edge if the maximum number of matches of the local gapless alignments between the two 2(k-1)-mers is greater than a cutoff value. We identify a motif as a set of similar k-mers from a merged group of maximum cliques associated with some vertices. CONCLUSIONS: When evaluated on both synthetic and real datasets of prokaryotes and eukaryotes, MotifClick outperforms existing leading motif-finding tools for prediction accuracy and balancing the prediction sensitivity and specificity in general. In particular, when the distribution of nucleotides of binding sites is similar to that of their background sequences, MotifClick is more likely to identify the binding sites than the other tools.
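The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `max_gapless_matches` and `build_graph` are hypothetical, every 2(k-1)-mer is used as a vertex (the abstract says only "some"), and a "local gapless alignment" is assumed to mean sliding one window over the other without gaps, with at least k overlapping positions. The clique-finding and merging steps are not shown.

```python
from itertools import combinations


def max_gapless_matches(a: str, b: str, min_overlap: int = 1) -> int:
    """Best match count over all ungapped offsets of b relative to a.

    Assumption: only offsets with at least `min_overlap` overlapping
    positions are considered.
    """
    best = 0
    for offset in range(-(len(b) - min_overlap), len(a) - min_overlap + 1):
        # Indices of `a` covered by `b` when b[0] is placed at a[offset].
        start = max(0, offset)
        end = min(len(a), offset + len(b))
        matches = sum(a[i] == b[i - offset] for i in range(start, end))
        best = max(best, matches)
    return best


def build_graph(sequences, k, cutoff):
    """Vertices are 2(k-1)-mers; an edge joins two vertices whose best
    ungapped match count is greater than `cutoff`.

    Assumption: all windows of every input sequence become vertices.
    """
    width = 2 * (k - 1)
    vertices = [
        (s, p, seq[p:p + width])  # (sequence index, position, window)
        for s, seq in enumerate(sequences)
        for p in range(len(seq) - width + 1)
    ]
    edges = set()
    for (i, u), (j, v) in combinations(enumerate(vertices), 2):
        if max_gapless_matches(u[2], v[2], min_overlap=k) > cutoff:
            edges.add((i, j))
    return vertices, edges


vertices, edges = build_graph(["ACGTAC", "ACGTTT"], k=3, cutoff=3)
```

With k = 3 each window has length 4, so a cutoff of 3 keeps only pairs of identical windows; a lower cutoff admits progressively weaker similarities. The pairwise loop is quadratic in the number of windows, which is consistent with the polynomial-time claim but says nothing about the actual running time of MotifClick.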
format Online
Article
Text
id pubmed-3225181
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3225181 2011-11-29 BMC Bioinformatics Methodology Article BioMed Central 2011-06-16 /pmc/articles/PMC3225181/ /pubmed/21679436 http://dx.doi.org/10.1186/1471-2105-12-238 Text en Copyright ©2011 Zhang et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title MotifClick: prediction of cis-regulatory binding sites via merging cliques
topic Methodology Article