Binding Site Graphs: A New Graph Theoretical Framework for Prediction of Transcription Factor Binding Sites

Computational prediction of nucleotide binding specificity for transcription factors remains a fundamental and largely unsolved problem. Determination of binding positions is a prerequisite for research in gene regulation, a major mechanism controlling phenotypic diversity. Furthermore, an accurate...

Bibliographic Details
Main Authors: Reddy, Timothy E; DeLisi, Charles; Shakhnovich, Boris E
Format: Text
Language: English
Published: Public Library of Science 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1866359/
https://www.ncbi.nlm.nih.gov/pubmed/17500587
http://dx.doi.org/10.1371/journal.pcbi.0030090
_version_ 1782133259160780800
author Reddy, Timothy E
DeLisi, Charles
Shakhnovich, Boris E
author_facet Reddy, Timothy E
DeLisi, Charles
Shakhnovich, Boris E
author_sort Reddy, Timothy E
collection PubMed
description Computational prediction of nucleotide binding specificity for transcription factors remains a fundamental and largely unsolved problem. Determination of binding positions is a prerequisite for research in gene regulation, a major mechanism controlling phenotypic diversity. Furthermore, an accurate determination of binding specificities from high-throughput data sources is necessary to realize the full potential of systems biology. Unfortunately, a recently performed independent evaluation showed that more than half the predictions from the most widely used algorithms are false. We introduce a graph-theoretical framework to describe local sequence similarity as the pair-wise distances between nucleotides in promoter sequences, and hypothesize that densely connected subgraphs are indicative of transcription factor binding sites. Using a well-established sampling algorithm coupled with simple clustering and scoring schemes, we identify sets of closely related nucleotides and test those for known TF binding activity. Using an independent benchmark, we find our algorithm predicts yeast binding motifs considerably better than currently available techniques and without manual curation. Importantly, we reduce the number of false positive predictions in yeast to less than 30%. We also develop a framework to evaluate the statistical significance of our motif predictions. We show that our approach is robust to the choice of input promoters, and thus can be used in the context of predicting binding positions from noisy experimental data. We apply our method to identify binding sites using data from genome-scale ChIP–chip experiments. Results from these experiments are publicly available at http://cagt10.bu.edu/BSG. The graphical framework developed here may be useful when combining predictions from numerous computational and experimental measures. Finally, we discuss how our algorithm can be used to improve the sensitivity of computational predictions of transcription factor binding specificities.
format Text
id pubmed-1866359
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1866359 2007-05-11 Binding Site Graphs: A New Graph Theoretical Framework for Prediction of Transcription Factor Binding Sites Reddy, Timothy E DeLisi, Charles Shakhnovich, Boris E PLoS Comput Biol Research Article Public Library of Science 2007-05 2007-05-11 /pmc/articles/PMC1866359/ /pubmed/17500587 http://dx.doi.org/10.1371/journal.pcbi.0030090 Text en © 2007 Reddy et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Reddy, Timothy E
DeLisi, Charles
Shakhnovich, Boris E
Binding Site Graphs: A New Graph Theoretical Framework for Prediction of Transcription Factor Binding Sites
title Binding Site Graphs: A New Graph Theoretical Framework for Prediction of Transcription Factor Binding Sites
title_sort binding site graphs: a new graph theoretical framework for prediction of transcription factor binding sites
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1866359/
https://www.ncbi.nlm.nih.gov/pubmed/17500587
http://dx.doi.org/10.1371/journal.pcbi.0030090