A Biophysical Model for Analysis of Transcription Factor Interaction and Binding Site Arrangement from Genome-Wide Binding Data

BACKGROUND: How transcription factors (TFs) interact with cis-regulatory sequences and interact with each other is a fundamental, but not well understood, aspect of gene regulation. METHODOLOGY/PRINCIPAL FINDINGS: We present a computational method to address this question, relying on the established...

Bibliographic Details
Main Authors: He, Xin; Chen, Chieh-Chun; Hong, Feng; Fang, Fang; Sinha, Saurabh; Ng, Huck-Hui; Zhong, Sheng
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2780727/
https://www.ncbi.nlm.nih.gov/pubmed/19956545
http://dx.doi.org/10.1371/journal.pone.0008155
_version_ 1782174525878697984
author He, Xin
Chen, Chieh-Chun
Hong, Feng
Fang, Fang
Sinha, Saurabh
Ng, Huck-Hui
Zhong, Sheng
author_sort He, Xin
collection PubMed
description BACKGROUND: How transcription factors (TFs) interact with cis-regulatory sequences and interact with each other is a fundamental, but not well understood, aspect of gene regulation. METHODOLOGY/PRINCIPAL FINDINGS: We present a computational method to address this question, relying on the established biophysical principles. This method, STAP (sequence to affinity prediction), takes into account all combinations and configurations of strong and weak binding sites to analyze large scale transcription factor (TF)-DNA binding data to discover cooperative interactions among TFs, infer sequence rules of interaction and predict TF target genes in new conditions with no TF-DNA binding data. The distinctions between STAP and other statistical approaches for analyzing cis-regulatory sequences include the utility of physical principles and the treatment of the DNA binding data as quantitative representation of binding strengths. Applying this method to the ChIP-seq data of 12 TFs in mouse embryonic stem (ES) cells, we found that the strength of TF-DNA binding could be significantly modulated by cooperative interactions among TFs with adjacent binding sites. However, further analysis on five putatively interacting TF pairs suggests that such interactions may be relatively insensitive to the distance and orientation of binding sites. Testing a set of putative Nanog motifs, STAP showed that a novel Nanog motif could better explain the ChIP-seq data than previously published ones. We then experimentally tested and verified the new Nanog motif. A series of comparisons showed that STAP has more predictive power than several state-of-the-art methods for cis-regulatory sequence analysis. We took advantage of this power to study the evolution of TF-target relationship in Drosophila. By learning the TF-DNA interaction models from the ChIP-chip data of D. melanogaster (Mel) and applying them to the genome of D. pseudoobscura (Pse), we found that only about half of the sequences strongly bound by TFs in Mel have high binding affinities in Pse. We show that prediction of functional TF targets from ChIP-chip data can be improved by using the conservation of STAP predicted affinities as an additional filter. CONCLUSIONS/SIGNIFICANCE: STAP is an effective method to analyze binding site arrangements, TF cooperativity, and TF target genes from genome-wide TF-DNA binding data.
format Text
id pubmed-2780727
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2780727 2009-12-03 A Biophysical Model for Analysis of Transcription Factor Interaction and Binding Site Arrangement from Genome-Wide Binding Data He, Xin Chen, Chieh-Chun Hong, Feng Fang, Fang Sinha, Saurabh Ng, Huck-Hui Zhong, Sheng PLoS One Research Article Public Library of Science 2009-12-01 /pmc/articles/PMC2780727/ /pubmed/19956545 http://dx.doi.org/10.1371/journal.pone.0008155 Text en He et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Biophysical Model for Analysis of Transcription Factor Interaction and Binding Site Arrangement from Genome-Wide Binding Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2780727/
https://www.ncbi.nlm.nih.gov/pubmed/19956545
http://dx.doi.org/10.1371/journal.pone.0008155