Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes

Although cis-regulatory binding sites (CRBSs) are at least as important as the coding sequences in a genome, our general understanding of them in most sequenced genomes is very limited due to the lack of efficient and accurate experimental and computational methods for their characterization, which...

Bibliographic Details
Main authors: Zhang, Shaoqiang; Xu, Minli; Li, Shan; Su, Zhengchang
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2691844/
https://www.ncbi.nlm.nih.gov/pubmed/19383880
http://dx.doi.org/10.1093/nar/gkp248
author Zhang, Shaoqiang
Xu, Minli
Li, Shan
Su, Zhengchang
collection PubMed
description Although cis-regulatory binding sites (CRBSs) are at least as important as the coding sequences in a genome, our general understanding of them in most sequenced genomes is very limited due to the lack of efficient and accurate experimental and computational methods for their characterization, which has largely hindered our understanding of many important biological processes. In this article, we describe a novel algorithm for genome-wide de novo prediction of CRBSs with high accuracy. We designed our algorithm to circumvent three identified difficulties for CRBS prediction using comparative genomics principles based on a new method for the selection of reference genomes, a new metric for measuring the similarity of CRBSs, and a new graph clustering procedure. When operon structures are correctly predicted, our algorithm can predict 81% of known individual binding sites belonging to 94% of known cis-regulatory motifs in the Escherichia coli K12 genome, while achieving high prediction specificity. Our algorithm has also achieved similar prediction accuracy in the Bacillus subtilis genome, suggesting that it is very robust, and thus can be applied to any other sequenced prokaryotic genome. When compared with the prior state-of-the-art algorithms, our algorithm outperforms them in both prediction sensitivity and specificity.
format Text
id pubmed-2691844
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2691844 2009-07-17 Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes Zhang, Shaoqiang Xu, Minli Li, Shan Su, Zhengchang Nucleic Acids Res Methods Online Oxford University Press 2009-06 2009-04-21 /pmc/articles/PMC2691844/ /pubmed/19383880 http://dx.doi.org/10.1093/nar/gkp248 Text en © 2009 The Author(s) This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes
topic Methods Online