
A Systems Biology Approach to Transcription Factor Binding Site Prediction

BACKGROUND: The elucidation of mammalian transcriptional regulatory networks holds great promise for both basic and translational research and remains one of the greatest challenges to systems biology. Recent reverse engineering methods deduce regulatory interactions from large-scale mRNA expression pr...


Bibliographic Details
Main Authors: Zhou, Xiang; Sumazin, Pavel; Rajbhandari, Presha; Califano, Andrea
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845628/
https://www.ncbi.nlm.nih.gov/pubmed/20360861
http://dx.doi.org/10.1371/journal.pone.0009878
_version_ 1782179422822989824
author Zhou, Xiang
Sumazin, Pavel
Rajbhandari, Presha
Califano, Andrea
author_facet Zhou, Xiang
Sumazin, Pavel
Rajbhandari, Presha
Califano, Andrea
author_sort Zhou, Xiang
collection PubMed
description BACKGROUND: The elucidation of mammalian transcriptional regulatory networks holds great promise for both basic and translational research and remains one of the greatest challenges to systems biology. Recent reverse engineering methods deduce regulatory interactions from large-scale mRNA expression profiles and cross-species conserved regulatory regions in DNA. Technical challenges faced by these methods include distinguishing between direct and indirect interactions, associating transcription regulators with predicted transcription factor binding sites (TFBSs), identifying non-linearly conserved binding sites across species, and providing realistic accuracy estimates. METHODOLOGY/PRINCIPAL FINDINGS: We address these challenges by closely integrating proven methods for regulatory network reverse engineering from mRNA expression data, linearly and non-linearly conserved regulatory region discovery, and TFBS evaluation and discovery. Using an extensive test set of high-likelihood interactions, which we collected in order to provide realistic prediction-accuracy estimates, we show that a careful integration of these methods leads to significant improvements in prediction accuracy. To verify our methods, we biochemically validated TFBS predictions made for both transcription factors (TFs) and co-factors; we validated binding site predictions made using a known E2F1 DNA-binding motif on E2F1 predicted promoter targets, known E2F1 and JUND motifs on JUND predicted promoter targets, and a de novo discovered motif for BCL6 on BCL6 predicted promoter targets. Finally, to demonstrate accuracy of prediction using an external dataset, we showed that sites matching predicted motifs for ZNF263 are significantly enriched in recent ZNF263 ChIP-seq data.
CONCLUSIONS/SIGNIFICANCE: Using an integrative framework, we were able to address technical challenges faced by state-of-the-art network reverse engineering methods, leading to significant improvement in direct-interaction detection and TFBS-discovery accuracy. We estimated the accuracy of our framework on a human B-cell-specific test set, which may help guide future methodological development.
format Text
id pubmed-2845628
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2845628 2010-04-02 A Systems Biology Approach to Transcription Factor Binding Site Prediction Zhou, Xiang Sumazin, Pavel Rajbhandari, Presha Califano, Andrea PLoS One Research Article Public Library of Science 2010-03-26 /pmc/articles/PMC2845628/ /pubmed/20360861 http://dx.doi.org/10.1371/journal.pone.0009878 Text en Zhou et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Zhou, Xiang
Sumazin, Pavel
Rajbhandari, Presha
Califano, Andrea
A Systems Biology Approach to Transcription Factor Binding Site Prediction
title A Systems Biology Approach to Transcription Factor Binding Site Prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845628/
https://www.ncbi.nlm.nih.gov/pubmed/20360861
http://dx.doi.org/10.1371/journal.pone.0009878