Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites
BACKGROUND: The binding of transcription factors to their respective DNA sites is a key component of every regulatory network. Predictions of transcription factor binding sites are usually based on models for transcription factor specificity. These models, in turn, are often based on examples of kno...
Main Authors: | Homsi, Dana S. F.; Gupta, Vineet; Stormo, Gary D. |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science, 2009 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2726951/ https://www.ncbi.nlm.nih.gov/pubmed/19707584 http://dx.doi.org/10.1371/journal.pone.0006736 |
_version_ | 1782170644423639040 |
---|---|
author | Homsi, Dana S. F. Gupta, Vineet Stormo, Gary D. |
author_facet | Homsi, Dana S. F. Gupta, Vineet Stormo, Gary D. |
author_sort | Homsi, Dana S. F. |
collection | PubMed |
description | BACKGROUND: The binding of transcription factors to their respective DNA sites is a key component of every regulatory network. Predictions of transcription factor binding sites are usually based on models for transcription factor specificity. These models, in turn, are often based on examples of known binding sites. METHODOLOGY/PRINCIPAL FINDINGS: Collections of binding sites are obtained in simulation experiments where the true model for the transcription factor is known and various sampling procedures are employed. We compare the accuracies of three different and commonly used methods for predicting the specificity of the transcription factor based on example binding sites. Different methods for constructing the models can lead to significant differences in the accuracy of the predictions and we show that commonly used methods can be positively misleading, even at large sample sizes and using noise-free data. Methods that minimize the number of predicted binding sequences are often significantly more accurate than the other methods tested. CONCLUSIONS/SIGNIFICANCE: Different methods for generating motifs from example binding sites can have significantly different numbers of false positive and false negative predictions. For many different sampling procedures models based on quadratic programming are the most accurate. |
format | Text |
id | pubmed-2726951 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-27269512009-08-25 Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites Homsi, Dana S. F. Gupta, Vineet Stormo, Gary D. PLoS One Research Article BACKGROUND: The binding of transcription factors to their respective DNA sites is a key component of every regulatory network. Predictions of transcription factor binding sites are usually based on models for transcription factor specificity. These models, in turn, are often based on examples of known binding sites. METHODOLOGY/PRINCIPAL FINDINGS: Collections of binding sites are obtained in simulation experiments where the true model for the transcription factor is known and various sampling procedures are employed. We compare the accuracies of three different and commonly used methods for predicting the specificity of the transcription factor based on example binding sites. Different methods for constructing the models can lead to significant differences in the accuracy of the predictions and we show that commonly used methods can be positively misleading, even at large sample sizes and using noise-free data. Methods that minimize the number of predicted binding sequences are often significantly more accurate than the other methods tested. CONCLUSIONS/SIGNIFICANCE: Different methods for generating motifs from example binding sites can have significantly different numbers of false positive and false negative predictions. For many different sampling procedures models based on quadratic programming are the most accurate. Public Library of Science 2009-08-25 /pmc/articles/PMC2726951/ /pubmed/19707584 http://dx.doi.org/10.1371/journal.pone.0006736 Text en Homsi et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Homsi, Dana S. F. Gupta, Vineet Stormo, Gary D. Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites |
title | Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites |
title_full | Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites |
title_fullStr | Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites |
title_full_unstemmed | Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites |
title_short | Modeling the Quantitative Specificity of DNA-Binding Proteins from Example Binding Sites |
title_sort | modeling the quantitative specificity of dna-binding proteins from example binding sites |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2726951/ https://www.ncbi.nlm.nih.gov/pubmed/19707584 http://dx.doi.org/10.1371/journal.pone.0006736 |
work_keys_str_mv | AT homsidanasf modelingthequantitativespecificityofdnabindingproteinsfromexamplebindingsites AT guptavineet modelingthequantitativespecificityofdnabindingproteinsfromexamplebindingsites AT stormogaryd modelingthequantitativespecificityofdnabindingproteinsfromexamplebindingsites |