A Systems Biology Approach to Transcription Factor Binding Site Prediction
BACKGROUND: The elucidation of mammalian transcriptional regulatory networks holds great promise for both basic and translational research and remains one of the greatest challenges to systems biology. Recent reverse engineering methods deduce regulatory interactions from large-scale mRNA expression pr...
Main authors: | Zhou, Xiang; Sumazin, Pavel; Rajbhandari, Presha; Califano, Andrea |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science, 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845628/ https://www.ncbi.nlm.nih.gov/pubmed/20360861 http://dx.doi.org/10.1371/journal.pone.0009878 |
_version_ | 1782179422822989824 |
---|---|
author | Zhou, Xiang Sumazin, Pavel Rajbhandari, Presha Califano, Andrea |
author_facet | Zhou, Xiang Sumazin, Pavel Rajbhandari, Presha Califano, Andrea |
author_sort | Zhou, Xiang |
collection | PubMed |
description | BACKGROUND: The elucidation of mammalian transcriptional regulatory networks holds great promise for both basic and translational research and remains one of the greatest challenges to systems biology. Recent reverse engineering methods deduce regulatory interactions from large-scale mRNA expression profiles and cross-species conserved regulatory regions in DNA. Technical challenges faced by these methods include distinguishing between direct and indirect interactions, associating transcription regulators with predicted transcription factor binding sites (TFBSs), identifying non-linearly conserved binding sites across species, and providing realistic accuracy estimates. METHODOLOGY/PRINCIPAL FINDINGS: We address these challenges by closely integrating proven methods for regulatory network reverse engineering from mRNA expression data, linearly and non-linearly conserved regulatory region discovery, and TFBS evaluation and discovery. Using an extensive test set of high-likelihood interactions, which we collected in order to provide realistic prediction-accuracy estimates, we show that a careful integration of these methods leads to significant improvements in prediction accuracy. To verify our methods, we biochemically validated TFBS predictions made for both transcription factors (TFs) and co-factors; we validated binding site predictions made using a known E2F1 DNA-binding motif on E2F1 predicted promoter targets, known E2F1 and JUND motifs on JUND predicted promoter targets, and a de novo discovered motif for BCL6 on BCL6 predicted promoter targets. Finally, to demonstrate accuracy of prediction using an external dataset, we showed that sites matching predicted motifs for ZNF263 are significantly enriched in recent ZNF263 ChIP-seq data.
CONCLUSIONS/SIGNIFICANCE: Using an integrative framework, we were able to address technical challenges faced by state-of-the-art network reverse engineering methods, leading to significant improvement in direct-interaction detection and TFBS-discovery accuracy. We estimated the accuracy of our framework on a human B-cell-specific test set, which may help guide future methodological development. |
format | Text |
id | pubmed-2845628 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-2845628 2010-04-02 A Systems Biology Approach to Transcription Factor Binding Site Prediction Zhou, Xiang Sumazin, Pavel Rajbhandari, Presha Califano, Andrea PLoS One Research Article BACKGROUND: The elucidation of mammalian transcriptional regulatory networks holds great promise for both basic and translational research and remains one of the greatest challenges to systems biology. Recent reverse engineering methods deduce regulatory interactions from large-scale mRNA expression profiles and cross-species conserved regulatory regions in DNA. Technical challenges faced by these methods include distinguishing between direct and indirect interactions, associating transcription regulators with predicted transcription factor binding sites (TFBSs), identifying non-linearly conserved binding sites across species, and providing realistic accuracy estimates. METHODOLOGY/PRINCIPAL FINDINGS: We address these challenges by closely integrating proven methods for regulatory network reverse engineering from mRNA expression data, linearly and non-linearly conserved regulatory region discovery, and TFBS evaluation and discovery. Using an extensive test set of high-likelihood interactions, which we collected in order to provide realistic prediction-accuracy estimates, we show that a careful integration of these methods leads to significant improvements in prediction accuracy. To verify our methods, we biochemically validated TFBS predictions made for both transcription factors (TFs) and co-factors; we validated binding site predictions made using a known E2F1 DNA-binding motif on E2F1 predicted promoter targets, known E2F1 and JUND motifs on JUND predicted promoter targets, and a de novo discovered motif for BCL6 on BCL6 predicted promoter targets. Finally, to demonstrate accuracy of prediction using an external dataset, we showed that sites matching predicted motifs for ZNF263 are significantly enriched in recent ZNF263 ChIP-seq data.
CONCLUSIONS/SIGNIFICANCE: Using an integrative framework, we were able to address technical challenges faced by state-of-the-art network reverse engineering methods, leading to significant improvement in direct-interaction detection and TFBS-discovery accuracy. We estimated the accuracy of our framework on a human B-cell-specific test set, which may help guide future methodological development. Public Library of Science 2010-03-26 /pmc/articles/PMC2845628/ /pubmed/20360861 http://dx.doi.org/10.1371/journal.pone.0009878 Text en Zhou et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Zhou, Xiang Sumazin, Pavel Rajbhandari, Presha Califano, Andrea A Systems Biology Approach to Transcription Factor Binding Site Prediction |
title | A Systems Biology Approach to Transcription Factor Binding Site Prediction |
title_full | A Systems Biology Approach to Transcription Factor Binding Site Prediction |
title_fullStr | A Systems Biology Approach to Transcription Factor Binding Site Prediction |
title_full_unstemmed | A Systems Biology Approach to Transcription Factor Binding Site Prediction |
title_short | A Systems Biology Approach to Transcription Factor Binding Site Prediction |
title_sort | systems biology approach to transcription factor binding site prediction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845628/ https://www.ncbi.nlm.nih.gov/pubmed/20360861 http://dx.doi.org/10.1371/journal.pone.0009878 |