An efficient method for statistical significance calculation of transcription factor binding sites

Various statistical models have been developed to describe the DNA binding preference of transcription factors, by which putative transcription factor binding sites (TFBS) can be identified according to scores assigned. The statistical significance of these scores, usually known as the p-value, plays a c...

Bibliographic Details
Main Authors: Qian, Ziliang; Lu, Lingyi; Qi, Liu; Li, Yixue
Format: Text
Language: English
Published: Biomedical Informatics Publishing Group 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2241927/
https://www.ncbi.nlm.nih.gov/pubmed/18305824
author Qian, Ziliang
Lu, Lingyi
Qi, Liu
Li, Yixue
collection PubMed
description Various statistical models have been developed to describe the DNA binding preferences of transcription factors; putative transcription factor binding sites (TFBS) are identified according to the scores these models assign. The statistical significance of these scores, usually expressed as a p-value, plays a critical role in identification. We developed an efficient algorithm that calculates this statistical significance precisely, reducing the time complexity from exponential to linear, and extended it to a wide range of models, from the commonly used position weight matrix models to the more complicated Bayesian network models. Further, we calculated p-values for all transcription factor DNA binding sites recorded in the JASPAR database and, based on these, investigated previously unexamined properties of p-values as a whole, such as the p-value distribution across different models and the variance of p-values under changed scoring schemes. We hope that our algorithm and the results of our computational experiments offer an improved solution for assessing the statistical significance of transcription factor binding sites. The software implementing our method can be downloaded from http://pcal.biosino.org/pCal.html.
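The record gives no algorithmic detail, so the following is only an illustrative sketch of the general idea of avoiding exhaustive enumeration when computing a p-value for a position weight matrix score. It is a standard dynamic-programming approach and is not taken from the article or from the pCal software. The function names, the integer-scored example matrix `EXAMPLE_PWM`, and the uniform background `UNIFORM` are all hypothetical. Scores are assumed to be integers and positions are assumed independent.

```python
from collections import defaultdict
from itertools import product


def pwm_score_distribution(pwm, background):
    """Return {score: probability} for a random site drawn from `background`.

    `pwm` is a list of columns, each a dict mapping base -> integer score.
    Columns are folded in one at a time, so the work per column depends on
    the number of distinct partial scores, not on the 4**L possible sites.
    """
    dist = {0: 1.0}
    for column in pwm:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for base, s in column.items():
                nxt[score + s] += prob * background[base]
        dist = dict(nxt)
    return dist


def p_value(pwm, background, threshold):
    """Probability that a random site scores at least `threshold`."""
    dist = pwm_score_distribution(pwm, background)
    return sum(p for s, p in dist.items() if s >= threshold)


def p_value_bruteforce(pwm, background, threshold):
    """Reference implementation: enumerate every possible site (exponential)."""
    total = 0.0
    for site in product(sorted(background), repeat=len(pwm)):
        score = sum(col[b] for col, b in zip(pwm, site))
        if score >= threshold:
            prob = 1.0
            for b in site:
                prob *= background[b]
            total += prob
    return total


# Hypothetical 3-column matrix and uniform background, for illustration only.
EXAMPLE_PWM = [
    {"A": 2, "C": -1, "G": -1, "T": 0},
    {"A": -1, "C": 3, "G": 0, "T": -1},
    {"A": 0, "C": -1, "G": 2, "T": -1},
]
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

if __name__ == "__main__":
    for t in (7, 4, 0):
        print(t, p_value(EXAMPLE_PWM, UNIFORM, t))
```

The brute-force function is included only as a cross-check on short motifs. Real matrices usually hold real-valued log-odds scores, which would need to be rounded or binned before this sketch applies.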
format Text
id pubmed-2241927
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Biomedical Informatics Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-2241927 2008-02-27 Bioinformation Prediction Model Biomedical Informatics Publishing Group 2007-12-30 /pmc/articles/PMC2241927/ /pubmed/18305824 Text en © 2007 Biomedical Informatics Publishing Group. This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
title An efficient method for statistical significance calculation of transcription factor binding sites
topic Prediction Model