An efficient method for statistical significance calculation of transcription factor binding sites
Various statistical models have been developed to describe the DNA binding preference of transcription factors, by which putative transcription factor binding sites (TFBS) can be identified according to scores assigned. Statistical significance of these scores, usually known as the p-value, plays a c...
Main authors: | Qian, Ziliang; Lu, Lingyi; Qi, Liu; Li, Yixue |
---|---|
Format: | Text |
Language: | English |
Published: | Biomedical Informatics Publishing Group, 2007 |
Subjects: | Prediction Model |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2241927/ https://www.ncbi.nlm.nih.gov/pubmed/18305824 |
_version_ | 1782150558452285440 |
---|---|
author | Qian, Ziliang Lu, Lingyi Qi, Liu Li, Yixue |
author_facet | Qian, Ziliang Lu, Lingyi Qi, Liu Li, Yixue |
author_sort | Qian, Ziliang |
collection | PubMed |
description | Various statistical models have been developed to describe the DNA binding preferences of transcription factors, by which putative transcription factor binding sites (TFBS) can be identified according to the scores assigned. The statistical significance of these scores, usually expressed as a p-value, plays a critical role in identification. We developed an efficient algorithm for precise calculation of statistical significance, which reduces the time complexity from exponential to linear, and extended its application to a wide range of models, from the commonly used position weight matrix models to the more complicated Bayesian network models. Further, we calculated p-values for all transcription factor DNA binding sites recorded in the JASPAR database and, based on these, investigated some previously unexamined properties of p-values as a whole, such as the p-value distribution of different models and the variation of p-values under different scoring schemes. We hope that our algorithm and the results of the computational experiments offer an improved solution for assessing the statistical significance of transcription factor binding sites. The software implementing our method can be downloaded from http://pcal.biosino.org/pCal.html. |
format | Text |
id | pubmed-2241927 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | Biomedical Informatics Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-22419272008-02-27 An efficient method for statistical significance calculation of transcription factor binding sites Qian, Ziliang Lu, Lingyi Qi, Liu Li, Yixue Bioinformation Prediction Model Various statistical models have been developed to describe the DNA binding preference of transcription factors, by which putative transcription factor binding sites (TFBS) can be identified according to scores assigned. Statistical significance of these scores, usually known as the p-value, play a critical role in identification. We developed an efficient algorithm to provide precise calculation of the statistical significance, remarkably enhancing the calculation efficiency by reducing the time complexity from an exponent scale to a linear scale, and successfully extended the application of this algorithm to a wide range of models, from the commonly used position weight matrix models to the complicated Bayesian Network models. Further, we calculated p-values of all transcription factor DNA binding sites recorded in the database, JASPAR, and based on these, we investigated some unseen properties of p-values as a whole, such as the p-value distribution of different models and the p-value variance according to changed scoring schemes. We hope that our algorithm and the result of computational experiments would offer an improved solution to the statistical significance of transcription factor binding sites. The software to implement our method can be downloaded from http://pcal.biosino.org/pCal.html. Biomedical Informatics Publishing Group 2007-12-30 /pmc/articles/PMC2241927/ /pubmed/18305824 Text en © 2007 Biomedical Informatics Publishing Group This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited. |
spellingShingle | Prediction Model Qian, Ziliang Lu, Lingyi Qi, Liu Li, Yixue An efficient method for statistical significance calculation of transcription factor binding sites |
title | An efficient method for statistical significance calculation of transcription factor binding sites |
title_sort | efficient method for statistical significance calculation of transcription factor binding sites |
topic | Prediction Model |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2241927/ https://www.ncbi.nlm.nih.gov/pubmed/18305824 |
work_keys_str_mv | AT qianziliang anefficientmethodforstatisticalsignificancecalculationoftranscriptionfactorbindingsites AT lulingyi anefficientmethodforstatisticalsignificancecalculationoftranscriptionfactorbindingsites AT qiliu anefficientmethodforstatisticalsignificancecalculationoftranscriptionfactorbindingsites AT liyixue anefficientmethodforstatisticalsignificancecalculationoftranscriptionfactorbindingsites AT qianziliang efficientmethodforstatisticalsignificancecalculationoftranscriptionfactorbindingsites AT lulingyi efficientmethodforstatisticalsignificancecalculationoftranscriptionfactorbindingsites AT qiliu efficientmethodforstatisticalsignificancecalculationoftranscriptionfactorbindingsites AT liyixue efficientmethodforstatisticalsignificancecalculationoftranscriptionfactorbindingsites |
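
The abstract above describes reducing the cost of p-value calculation from exponential to linear in the length of the binding site. The record does not give the algorithm itself, so the following is only a minimal sketch of one standard way to get that behaviour for a position weight matrix: a dynamic program over the distribution of partial scores, rather than an enumeration of all 4^L sequences. It is not necessarily the authors' method and does not cover the Bayesian network models. The function name `pwm_pvalue`, the integer-valued toy matrix, and the uniform background are all assumptions made for illustration.

```python
from collections import defaultdict


def pwm_pvalue(pwm, threshold, background=None):
    """Return P(score >= threshold) for a random site.

    pwm        -- list of dicts, one per position, mapping base -> integer score
                  (hypothetical toy format, not a JASPAR file format)
    threshold  -- score cutoff whose p-value is wanted
    background -- dict mapping base -> probability; assumed i.i.d. across positions
    """
    if background is None:
        # Assumption: uniform base composition.
        background = {base: 0.25 for base in "ACGT"}

    # dist maps a partial score to the probability of reaching it.
    dist = {0: 1.0}
    for column in pwm:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for base, p_base in background.items():
                nxt[score + column[base]] += prob * p_base
        dist = dict(nxt)

    # Sum the probability mass at or above the threshold.
    return sum(prob for score, prob in dist.items() if score >= threshold)


# Toy two-position matrix (made-up scores).
toy_pwm = [
    {"A": 2, "C": 0, "G": 0, "T": -1},
    {"A": 0, "C": 3, "G": -1, "T": 0},
]
p = pwm_pvalue(toy_pwm, threshold=3)
print(p)
```

In this sketch each position is processed once, so the work grows with the number of positions times the number of distinct partial scores, instead of with the 4^L sequences a brute-force enumeration would visit. Real-valued log-odds scores would first need to be rounded to a fixed grid for the score table to stay small.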