Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites

BACKGROUND: The binding of transcription factors to their respective DNA sites is a key component of every regulatory network. Predictions of transcription factor binding sites are usually based on models for transcription factor specificity. These models, in turn, are often based on examples of kno...

Bibliographic Details
Main authors: Homsi, Dana S. F., Gupta, Vineet, Stormo, Gary D.
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2726951/
https://www.ncbi.nlm.nih.gov/pubmed/19707584
http://dx.doi.org/10.1371/journal.pone.0006736
author Homsi, Dana S. F.
Gupta, Vineet
Stormo, Gary D.
collection PubMed
description BACKGROUND: The binding of transcription factors to their respective DNA sites is a key component of every regulatory network. Predictions of transcription factor binding sites are usually based on models for transcription factor specificity. These models, in turn, are often based on examples of known binding sites. METHODOLOGY/PRINCIPAL FINDINGS: Collections of binding sites are obtained in simulation experiments where the true model for the transcription factor is known and various sampling procedures are employed. We compare the accuracies of three different and commonly used methods for predicting the specificity of the transcription factor based on example binding sites. Different methods for constructing the models can lead to significant differences in the accuracy of the predictions, and we show that commonly used methods can be positively misleading, even at large sample sizes and using noise-free data. Methods that minimize the number of predicted binding sequences are often significantly more accurate than the other methods tested. CONCLUSIONS/SIGNIFICANCE: Different methods for generating motifs from example binding sites can have significantly different numbers of false positive and false negative predictions. For many different sampling procedures, models based on quadratic programming are the most accurate.
format Text
id pubmed-2726951
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2726951 2009-08-25 Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites. Homsi, Dana S. F.; Gupta, Vineet; Stormo, Gary D. PLoS One. Research Article. Public Library of Science 2009-08-25. /pmc/articles/PMC2726951/ /pubmed/19707584 http://dx.doi.org/10.1371/journal.pone.0006736 Text en. Homsi et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites
topic Research Article