
Asap: A Framework for Over-Representation Statistics for Transcription Factor Binding Sites

BACKGROUND: In studies of gene regulation the efficient computational detection of over-represented transcription factor binding sites is an increasingly important aspect. Several published methods can be used for testing whether a set of hypothesised co-regulated genes share a common regulatory reg...


Bibliographic Details
Main Authors: Marstrand, Troels T., Frellsen, Jes, Moltke, Ida, Thiim, Martin, Valen, Eivind, Retelska, Dorota, Krogh, Anders
Format: Text
Language: English
Published: Public Library of Science 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2229843/
https://www.ncbi.nlm.nih.gov/pubmed/18286180
http://dx.doi.org/10.1371/journal.pone.0001623
_version_ 1782150217392455680
author Marstrand, Troels T.
Frellsen, Jes
Moltke, Ida
Thiim, Martin
Valen, Eivind
Retelska, Dorota
Krogh, Anders
collection PubMed
description BACKGROUND: In studies of gene regulation, the efficient computational detection of over-represented transcription factor binding sites is an increasingly important aspect. Several published methods can be used for testing whether a set of hypothesised co-regulated genes share a common regulatory regime based on the occurrence of the modelled transcription factor binding sites. However, there is little or no information available for guiding the end user's choice of method. Furthermore, it would be necessary to obtain several different software programs from various sources to make a well-founded choice. METHODOLOGY: We introduce a software package, Asap, for fast searching with position weight matrices that includes several standard methods for assessing over-representation. We have compared the ability of these methods to detect over-represented transcription factor binding sites in artificial promoter sequences. Controlling all aspects of our input data, we are able to identify the optimal statistics across multiple threshold values and for sequence sets containing different distributions of transcription factor binding sites. CONCLUSIONS: We show that our implementation is significantly faster than more naïve scanning algorithms when searching with many weight matrices in large sequence sets. When comparing the various statistics, we show that those based on binomial over-representation and Fisher's exact test perform almost equally well, and better than the others. An online server is available at http://servers.binf.ku.dk/asap/.
format Text
id pubmed-2229843
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2229843 2008-02-20 Asap: A Framework for Over-Representation Statistics for Transcription Factor Binding Sites Marstrand, Troels T. Frellsen, Jes Moltke, Ida Thiim, Martin Valen, Eivind Retelska, Dorota Krogh, Anders PLoS One Research Article BACKGROUND: In studies of gene regulation, the efficient computational detection of over-represented transcription factor binding sites is an increasingly important aspect. Several published methods can be used for testing whether a set of hypothesised co-regulated genes share a common regulatory regime based on the occurrence of the modelled transcription factor binding sites. However, there is little or no information available for guiding the end user's choice of method. Furthermore, it would be necessary to obtain several different software programs from various sources to make a well-founded choice. METHODOLOGY: We introduce a software package, Asap, for fast searching with position weight matrices that includes several standard methods for assessing over-representation. We have compared the ability of these methods to detect over-represented transcription factor binding sites in artificial promoter sequences. Controlling all aspects of our input data, we are able to identify the optimal statistics across multiple threshold values and for sequence sets containing different distributions of transcription factor binding sites. CONCLUSIONS: We show that our implementation is significantly faster than more naïve scanning algorithms when searching with many weight matrices in large sequence sets. When comparing the various statistics, we show that those based on binomial over-representation and Fisher's exact test perform almost equally well, and better than the others. An online server is available at http://servers.binf.ku.dk/asap/. Public Library of Science 2008-02-20 /pmc/articles/PMC2229843/ /pubmed/18286180 http://dx.doi.org/10.1371/journal.pone.0001623 Text en Marstrand et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Asap: A Framework for Over-Representation Statistics for Transcription Factor Binding Sites
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2229843/
https://www.ncbi.nlm.nih.gov/pubmed/18286180
http://dx.doi.org/10.1371/journal.pone.0001623
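
The description in this record names scanning with position weight matrices and a binomial over-representation statistic. The following is a minimal Python sketch of that general idea, not the Asap implementation: the toy matrix `PWM`, the function names, the score threshold, and the choice to estimate the per-window hit rate from a background sequence set are all assumptions made for illustration. Sequences are assumed to contain only A, C, G and T.

```python
import math

# Hypothetical toy log-odds matrix for a 3-bp motif; "TAT" scores highest (3.0).
PWM = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def count_hits(seq, pwm, threshold):
    """Slide the matrix along seq and return (hits, windows scored)."""
    width = len(pwm)
    windows = max(len(seq) - width + 1, 0)
    hits = 0
    for start in range(windows):
        window = seq[start:start + width]
        # Sum the per-position scores of the bases in this window.
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits += 1
    return hits, windows


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def overrepresentation_pvalue(foreground, background, pwm, threshold):
    """Binomial p-value for seeing at least the observed foreground hits,
    using the background hit rate per window as the null probability."""
    fg_hits = fg_windows = bg_hits = bg_windows = 0
    for seq in foreground:
        hits, windows = count_hits(seq, pwm, threshold)
        fg_hits += hits
        fg_windows += windows
    for seq in background:
        hits, windows = count_hits(seq, pwm, threshold)
        bg_hits += hits
        bg_windows += windows
    if bg_windows == 0:
        raise ValueError("background contains no scorable windows")
    rate = bg_hits / bg_windows
    return binomial_upper_tail(fg_hits, fg_windows, rate)


if __name__ == "__main__":
    p = overrepresentation_pvalue(["TATTAT"], ["TATGGGGG"], PWM, 3.0)
    print(f"binomial over-representation p-value: {p:.4f}")
```

A small p-value suggests that the motif occurs more often in the foreground set than the background rate would predict. The sketch scans one strand only and scores every window independently, which is a simplification.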