Cargando…
Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast
Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictiv...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2015
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4546298/ https://www.ncbi.nlm.nih.gov/pubmed/26291518 http://dx.doi.org/10.1371/journal.pcbi.1004418 |
_version_ | 1782386897908137984 |
---|---|
author | Tsai, Zing Tsung-Yeh Shiu, Shin-Han Tsai, Huai-Kuang |
author_facet | Tsai, Zing Tsung-Yeh Shiu, Shin-Han Tsai, Huai-Kuang |
author_sort | Tsai, Zing Tsung-Yeh |
collection | PubMed |
description | Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific binding or general binding properties of TFs. Using budding yeast as model, we found that machine learning classifiers trained with either CS or DS features alone perform better in predicting TF-specific binding compared to SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA “intrinsic properties” (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy) that can be calculated from genomic sequences alone has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF agnostic and likely describes general binding potential of TFs. Thus, our findings suggest that it is feasible to establish a TF agnostic model for identifying functional regulatory regions in potentially any sequenced genome. |
format | Online Article Text |
id | pubmed-4546298 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-45462982015-08-26 Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast Tsai, Zing Tsung-Yeh Shiu, Shin-Han Tsai, Huai-Kuang PLoS Comput Biol Research Article Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific binding or general binding properties of TFs. Using budding yeast as model, we found that machine learning classifiers trained with either CS or DS features alone perform better in predicting TF-specific binding compared to SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA “intrinsic properties” (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy) that can be calculated from genomic sequences alone has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF agnostic and likely describes general binding potential of TFs. Thus, our findings suggest that it is feasible to establish a TF agnostic model for identifying functional regulatory regions in potentially any sequenced genome. Public Library of Science 2015-08-20 /pmc/articles/PMC4546298/ /pubmed/26291518 http://dx.doi.org/10.1371/journal.pcbi.1004418 Text en © 2015 Tsai et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Tsai, Zing Tsung-Yeh Shiu, Shin-Han Tsai, Huai-Kuang Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast |
title | Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast |
title_full | Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast |
title_fullStr | Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast |
title_full_unstemmed | Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast |
title_short | Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast |
title_sort | contribution of sequence motif, chromatin state, and dna structure features to predictive models of transcription factor binding in yeast |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4546298/ https://www.ncbi.nlm.nih.gov/pubmed/26291518 http://dx.doi.org/10.1371/journal.pcbi.1004418 |
work_keys_str_mv | AT tsaizingtsungyeh contributionofsequencemotifchromatinstateanddnastructurefeaturestopredictivemodelsoftranscriptionfactorbindinginyeast AT shiushinhan contributionofsequencemotifchromatinstateanddnastructurefeaturestopredictivemodelsoftranscriptionfactorbindinginyeast AT tsaihuaikuang contributionofsequencemotifchromatinstateanddnastructurefeaturestopredictivemodelsoftranscriptionfactorbindinginyeast |