Statistical significance of cis-regulatory modules

BACKGROUND: It is becoming increasingly important for researchers to be able to scan through large genomic regions for transcription factor binding sites or clusters of binding sites forming cis-regulatory modules. Correspondingly, there has been a push to develop algorithms for the rapid detection...

Bibliographic Details
Main Authors: Schones, Dustin E; Smith, Andrew D; Zhang, Michael Q
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1796902/
https://www.ncbi.nlm.nih.gov/pubmed/17241466
http://dx.doi.org/10.1186/1471-2105-8-19
author Schones, Dustin E
Smith, Andrew D
Zhang, Michael Q
collection PubMed
description BACKGROUND: It is becoming increasingly important for researchers to be able to scan through large genomic regions for transcription factor binding sites or clusters of binding sites forming cis-regulatory modules. Correspondingly, there has been a push to develop algorithms for the rapid detection and assessment of cis-regulatory modules. While various algorithms for this purpose have been introduced, most are not well suited for rapid, genome-scale scanning. RESULTS: We introduce methods designed for the detection and statistical evaluation of cis-regulatory modules, modeled either as clusters of individual binding sites or as combinations of sites with constrained organization. To determine the statistical significance of module sites, we first need a method to determine the statistical significance of single transcription factor binding site matches. We introduce a straightforward method of estimating the statistical significance of single site matches, using a database of known promoters to produce data structures from which p-values for binding site matches can be estimated. We next introduce a technique to calculate the statistical significance of the arrangement of binding sites within a module using a max-gap model. If the module scanned for has defined organizational parameters, the probability of the module is corrected to account for organizational constraints. The statistical significance of single site matches and the architecture of sites within the module can be combined to provide an overall estimate of the statistical significance of cis-regulatory module sites. CONCLUSION: The methods introduced in this paper allow for the detection and statistical evaluation of single transcription factor binding sites and cis-regulatory modules. The features described are implemented in the Search Tool for Occurrences of Regulatory Motifs (STORM) and MODSTORM software.
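The two ideas named in the abstract, empirical p-values for single site matches and max-gap grouping of sites, can be illustrated with a minimal sketch. This is not the STORM or MODSTORM implementation. The function names, the representation of a scoring matrix as a list of per-position dictionaries, the sorted score list as the "data structure", and the +1 pseudocount are all assumptions made for illustration. The sketch covers only the grouping step of a max-gap model, not the calculation of its statistical significance.

```python
import bisect

BASES = "ACGT"


def score_window(pwm, window):
    """Sum the matrix entries selected by each base of the window (hypothetical PWM layout)."""
    return sum(column[base] for column, base in zip(pwm, window))


def background_scores(pwm, sequences):
    """Score every full-width window of the background sequences and return the sorted scores."""
    width = len(pwm)
    scores = []
    for seq in sequences:
        for start in range(len(seq) - width + 1):
            window = seq[start:start + width]
            # Skip windows containing ambiguous bases such as N.
            if all(base in BASES for base in window):
                scores.append(score_window(pwm, window))
    scores.sort()
    return scores


def empirical_p_value(sorted_scores, score):
    """Estimate P(background score >= score), with a +1 pseudocount (an assumption)."""
    n = len(sorted_scores)
    at_least = n - bisect.bisect_left(sorted_scores, score)
    return (at_least + 1) / (n + 1)


def max_gap_clusters(positions, max_gap):
    """Group site positions so that consecutive sites in a cluster are at most max_gap apart."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters
```

In this sketch the background scores are computed once per matrix and then reused, so each p-value lookup costs a single binary search; in the setting the abstract describes, the background sequences would be a database of known promoters.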
format Text
id pubmed-1796902
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1796902 2007-02-16 Statistical significance of cis-regulatory modules Schones, Dustin E Smith, Andrew D Zhang, Michael Q BMC Bioinformatics Methodology Article BioMed Central 2007-01-22 /pmc/articles/PMC1796902/ /pubmed/17241466 http://dx.doi.org/10.1186/1471-2105-8-19 Text en Copyright © 2007 Schones et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Statistical significance of cis-regulatory modules
topic Methodology Article