Transcription Factor Binding Sites Prediction Based on Modified Nucleosomes
Main authors: | Talebzadeh, Mohammad; Zare-Mirakabad, Fatemeh |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3931712/ https://www.ncbi.nlm.nih.gov/pubmed/24586611 http://dx.doi.org/10.1371/journal.pone.0089226 |
_version_ | 1782304701737336832 |
---|---|
author | Talebzadeh, Mohammad; Zare-Mirakabad, Fatemeh |
author_sort | Talebzadeh, Mohammad |
collection | PubMed |
description | In computational methods, position weight matrices (PWMs) are commonly applied for transcription factor binding site (TFBS) prediction. Although these matrices are more accurate than simple consensus sequences at predicting actual binding sites, they usually produce a large number of false positive (FP) predictions and are therefore poor sources of information. Several studies have employed additional sources of information, such as sequence conservation or the vicinity to transcription start sites, to distinguish true binding regions from random ones. Recently, the spatial distribution of modified nucleosomes has been shown to be associated with different promoter architectures. These aligned patterns can facilitate DNA accessibility for transcription factors. We hypothesize that using data from these aligned and periodic patterns can improve the performance of binding region prediction. In this study, we propose two effective features, “modified nucleosomes neighboring” and “modified nucleosomes occupancy”, to decrease FP predictions in binding site discovery. Based on these features, we designed a logistic regression classifier which estimates the probability that a region is a TFBS. Our model learned each feature based on Sp1 binding sites on chromosome 1 and was tested on the other chromosomes in human CD4+ T cells. In this work, we investigated 21 histone modifications and found that only 8 of the 21 marks are strongly correlated with transcription factor binding regions. To show that these features are not specific to Sp1, we combined the logistic regression classifier with the PWM and created a new model to search for TFBSs on the genome. We tested the model using the transcription factors MAZ, PU.1 and ELF1 and compared the results to those obtained using only the PWM. The results show that our model can predict transcription factor binding regions more successfully. The relative simplicity of the model and its capability of integrating other features make it a superior method for TFBS prediction. |
format | Online Article Text |
id | pubmed-3931712 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3931712 2014-02-25 Transcription Factor Binding Sites Prediction Based on Modified Nucleosomes Talebzadeh, Mohammad; Zare-Mirakabad, Fatemeh PLoS One Research Article Public Library of Science 2014-02-21 /pmc/articles/PMC3931712/ /pubmed/24586611 http://dx.doi.org/10.1371/journal.pone.0089226 Text en © 2014 Talebzadeh, Zare-Mirakabad http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Transcription Factor Binding Sites Prediction Based on Modified Nucleosomes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3931712/ https://www.ncbi.nlm.nih.gov/pubmed/24586611 http://dx.doi.org/10.1371/journal.pone.0089226 |
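The description above outlines a two-stage idea: a PWM scan proposes candidate sites, and a logistic regression over the two nucleosome features estimates how likely each candidate is a true binding region. The following is a minimal sketch of that idea, not the authors' implementation. The PWM values, the background frequency, the weights, the feature values, and every function name are hypothetical, and using the classifier as a filter on PWM hits is only one assumed way of combining the two.

```python
import math

# Hypothetical 4-column PWM given as per-position base probabilities.
# These numbers are illustrative only and do not come from the article.
PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def pwm_score(window):
    """Log-odds score of one window against the PWM."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PWM, window))


def scan(sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(PWM)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = pwm_score(sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


def binding_probability(neighboring, occupancy, weights=(-2.0, 1.5, 1.0)):
    """Logistic model over the two features; the weights are made up."""
    bias, w_neighboring, w_occupancy = weights
    z = bias + w_neighboring * neighboring + w_occupancy * occupancy
    return 1.0 / (1.0 + math.exp(-z))


def filter_hits(hits, features, min_probability=0.5):
    """Keep PWM hits whose estimated binding probability is high enough.

    `features` maps a hit start position to a (neighboring, occupancy) pair;
    positions without feature data default to (0.0, 0.0).
    """
    kept = []
    for start, score in hits:
        neighboring, occupancy = features.get(start, (0.0, 0.0))
        probability = binding_probability(neighboring, occupancy)
        if probability >= min_probability:
            kept.append((start, score, probability))
    return kept


if __name__ == "__main__":
    candidates = scan("AAGGCGAAGGCGTT", threshold=4.0)
    # Hypothetical feature values for the two candidate positions.
    for hit in filter_hits(candidates, {2: (1.0, 1.0), 8: (0.0, 0.0)}):
        print(hit)
```

In this toy run both occurrences of the motif pass the PWM threshold, but only the one with supporting nucleosome features survives the filter, which illustrates how the extra features could reduce false positives. In practice the weights would be fitted on labeled binding sites rather than set by hand.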