Assessing the Effects of Symmetry on Motif Discovery and Modeling

BACKGROUND: Identifying the DNA binding sites for transcription factors is a key task in modeling the gene regulatory network of a cell. Predicting DNA binding sites computationally suffers from high false positives and false negatives due to various contributing factors, including the inaccurate mo...

Bibliographic Details
Main Authors: Motlhabi, Lala M.; Stormo, Gary D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3176789/
https://www.ncbi.nlm.nih.gov/pubmed/21949783
http://dx.doi.org/10.1371/journal.pone.0024908
author Motlhabi, Lala M.
Stormo, Gary D.
collection PubMed
description BACKGROUND: Identifying the DNA binding sites for transcription factors is a key task in modeling the gene regulatory network of a cell. Computational prediction of DNA binding sites suffers from high false-positive and false-negative rates due to various contributing factors, including inaccurate models of transcription factor specificity. One source of inaccuracy in the specificity models is the assumption of asymmetry for motifs that are actually symmetric. METHODOLOGY/PRINCIPAL FINDINGS: Using simulation studies, so that the correct binding site model is known and various parameters of the process can be systematically controlled, we test different motif-finding algorithms on both symmetric and asymmetric binding site data. We show that if the true binding site is asymmetric, the results are unambiguous and the asymmetric model is clearly superior to the symmetric model. But if the true binding specificity is symmetric, commonly used methods can incorrectly infer that the motif is asymmetric. The resulting inaccurate motifs lead to lower sensitivity and specificity than the correct, symmetric models would. We also show how the correct model can be obtained by the use of appropriate measures of statistical significance. CONCLUSIONS/SIGNIFICANCE: This study demonstrates that the most commonly used motif-finding approaches usually model symmetric motifs incorrectly, which leads to more false predictions than necessary. It also demonstrates how alternative motif-finding methods can correct the problem, providing more accurate motif models and reducing the errors. Furthermore, it provides criteria for determining whether a symmetric or an asymmetric model is more appropriate for a given experimental dataset.
format Online
Article
Text
id pubmed-3176789
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3176789 2011-09-26 Assessing the Effects of Symmetry on Motif Discovery and Modeling Motlhabi, Lala M. Stormo, Gary D. PLoS One Research Article
Public Library of Science 2011-09-20 /pmc/articles/PMC3176789/ /pubmed/21949783 http://dx.doi.org/10.1371/journal.pone.0024908 Text en Motlhabi, Stormo. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Assessing the Effects of Symmetry on Motif Discovery and Modeling
topic Research Article