
Transcription Factor Binding Sites Prediction Based on Modified Nucleosomes

In computational methods, position weight matrices (PWMs) are commonly applied for transcription factor binding site (TFBS) prediction. Although these matrices are more accurate than simple consensus sequences to predict actual binding sites, they usually produce a large number of false positive (FP...


Bibliographic Details
Main authors: Talebzadeh, Mohammad; Zare-Mirakabad, Fatemeh
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3931712/
https://www.ncbi.nlm.nih.gov/pubmed/24586611
http://dx.doi.org/10.1371/journal.pone.0089226
author Talebzadeh, Mohammad
Zare-Mirakabad, Fatemeh
collection PubMed
description In computational methods, position weight matrices (PWMs) are commonly applied to transcription factor binding site (TFBS) prediction. Although these matrices are more accurate than simple consensus sequences at predicting actual binding sites, they usually produce a large number of false positive (FP) predictions and so are impoverished sources of information. Several studies have employed additional sources of information, such as sequence conservation or proximity to transcription start sites, to distinguish true binding regions from random ones. Recently, the spatial distribution of modified nucleosomes has been shown to be associated with different promoter architectures. These aligned patterns can facilitate DNA accessibility for transcription factors. We hypothesize that using data from these aligned and periodic patterns can improve the performance of binding region prediction. In this study, we propose two features, “modified nucleosomes neighboring” and “modified nucleosomes occupancy”, to reduce the number of FPs in binding site discovery. Based on these features, we designed a logistic regression classifier that estimates the probability that a region is a TFBS. Our model learned each feature from Sp1 binding sites on chromosome 1 and was tested on the other chromosomes in human CD4+ T cells. We investigated 21 histone modifications and found that only 8 of the 21 marks are strongly correlated with transcription factor binding regions. To show that these features are not specific to Sp1, we combined the logistic regression classifier with the PWM and created a new model to search for TFBSs across the genome. We tested the model using the transcription factors MAZ, PU.1 and ELF1 and compared the results with those obtained using the PWM alone. The results show that our model predicts transcription factor binding regions more successfully. The relative simplicity of the model and its capability of integrating other features make it a superior method for TFBS prediction.
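The combination described in the abstract (a PWM scan followed by a logistic regression over two nucleosome-based features) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the toy matrix TOY_PWM, the uniform background, the weights and cutoffs, the helper names, and the choice to combine the two scores as a simple "pass both cutoffs" filter. The record does not say how the authors actually computed the features or merged the classifier with the PWM.

```python
import math

# Hypothetical 3-column PWM (base probabilities per position); it favors "GGC".
TOY_PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform base frequency


def pwm_log_odds(site, pwm, background=BACKGROUND):
    """Sum of log2(p / background) over the positions of one candidate site."""
    return sum(math.log2(col[base] / background) for base, col in zip(site, pwm))


def binding_probability(neighboring, occupancy, weights, bias):
    """Logistic regression over the two (hypothetical) feature values."""
    z = bias + weights[0] * neighboring + weights[1] * occupancy
    return 1.0 / (1.0 + math.exp(-z))


def predict_sites(sequence, pwm, features, weights, bias, pwm_cutoff, prob_cutoff):
    """Scan the sequence; keep windows passing both the PWM and classifier cutoffs.

    `features(start)` is a caller-supplied function returning the pair
    (neighboring, occupancy) for the window beginning at `start`.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = pwm_log_odds(sequence[start:start + width], pwm)
        if score < pwm_cutoff:
            continue  # rejected by the PWM alone
        neighboring, occupancy = features(start)
        prob = binding_probability(neighboring, occupancy, weights, bias)
        if prob >= prob_cutoff:
            hits.append((start, score, prob))
    return hits
```

In this sketch the PWM acts as a first filter and the classifier removes PWM matches whose chromatin context looks unfavorable, which is one plausible way to reduce false positives; a weighted sum of the two scores would be another.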
format Online
Article
Text
id pubmed-3931712
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3931712 2014-02-25 Transcription Factor Binding Sites Prediction Based on Modified Nucleosomes Talebzadeh, Mohammad Zare-Mirakabad, Fatemeh PLoS One Research Article Public Library of Science 2014-02-21 /pmc/articles/PMC3931712/ /pubmed/24586611 http://dx.doi.org/10.1371/journal.pone.0089226 Text en © 2014 Talebzadeh, Zare-Mirakabad http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Transcription Factor Binding Sites Prediction Based on Modified Nucleosomes
topic Research Article