Finding regulatory elements and regulatory motifs: a general probabilistic framework
Over the last two decades a large number of algorithms has been developed for regulatory motif finding. Here we show how many of these algorithms, especially those that model binding specificities of regulatory factors with position specific weight matrices (WMs), naturally arise within a general Ba...
Main author: | van Nimwegen, Erik |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1995539/ https://www.ncbi.nlm.nih.gov/pubmed/17903285 http://dx.doi.org/10.1186/1471-2105-8-S6-S4 |
_version_ | 1782135519877005312 |
---|---|
author | van Nimwegen, Erik |
author_facet | van Nimwegen, Erik |
author_sort | van Nimwegen, Erik |
collection | PubMed |
description | Over the last two decades a large number of algorithms has been developed for regulatory motif finding. Here we show how many of these algorithms, especially those that model binding specificities of regulatory factors with position specific weight matrices (WMs), naturally arise within a general Bayesian probabilistic framework. We discuss how WMs are constructed from sets of regulatory sites, how sites for a given WM can be discovered by scanning of large sequences, how to cluster WMs, and more generally how to cluster large sets of sites from different WMs into clusters. We discuss how 'regulatory modules', clusters of sites for subsets of WMs, can be found in large intergenic sequences, and we discuss different methods for ab initio motif finding, including expectation maximization (EM) algorithms, and motif sampling algorithms. Finally, we extensively discuss how module finding methods and ab initio motif finding methods can be extended to take phylogenetic relations between the input sequences into account, i.e. we show how motif finding and phylogenetic footprinting can be integrated in a rigorous probabilistic framework. The article is intended for readers with a solid background in applied mathematics, and preferably with some knowledge of general Bayesian probabilistic methods. The main purpose of the article is to elucidate that all these methods are not a disconnected set of individual algorithmic recipes, but that they are just different facets of a single integrated probabilistic theory. |
format | Text |
id | pubmed-1995539 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-19955392007-10-02 Finding regulatory elements and regulatory motifs: a general probabilistic framework van Nimwegen, Erik BMC Bioinformatics Review Over the last two decades a large number of algorithms has been developed for regulatory motif finding. Here we show how many of these algorithms, especially those that model binding specificities of regulatory factors with position specific weight matrices (WMs), naturally arise within a general Bayesian probabilistic framework. We discuss how WMs are constructed from sets of regulatory sites, how sites for a given WM can be discovered by scanning of large sequences, how to cluster WMs, and more generally how to cluster large sets of sites from different WMs into clusters. We discuss how 'regulatory modules', clusters of sites for subsets of WMs, can be found in large intergenic sequences, and we discuss different methods for ab initio motif finding, including expectation maximization (EM) algorithms, and motif sampling algorithms. Finally, we extensively discuss how module finding methods and ab initio motif finding methods can be extended to take phylogenetic relations between the input sequences into account, i.e. we show how motif finding and phylogenetic footprinting can be integrated in a rigorous probabilistic framework. The article is intended for readers with a solid background in applied mathematics, and preferably with some knowledge of general Bayesian probabilistic methods. The main purpose of the article is to elucidate that all these methods are not a disconnected set of individual algorithmic recipes, but that they are just different facets of a single integrated probabilistic theory. BioMed Central 2007-09-27 /pmc/articles/PMC1995539/ /pubmed/17903285 http://dx.doi.org/10.1186/1471-2105-8-S6-S4 Text en Copyright © 2007 van Nimwegen; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Review van Nimwegen, Erik Finding regulatory elements and regulatory motifs: a general probabilistic framework |
title | Finding regulatory elements and regulatory motifs: a general probabilistic framework |
title_full | Finding regulatory elements and regulatory motifs: a general probabilistic framework |
title_fullStr | Finding regulatory elements and regulatory motifs: a general probabilistic framework |
title_full_unstemmed | Finding regulatory elements and regulatory motifs: a general probabilistic framework |
title_short | Finding regulatory elements and regulatory motifs: a general probabilistic framework |
title_sort | finding regulatory elements and regulatory motifs: a general probabilistic framework |
topic | Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1995539/ https://www.ncbi.nlm.nih.gov/pubmed/17903285 http://dx.doi.org/10.1186/1471-2105-8-S6-S4 |
work_keys_str_mv | AT vannimwegenerik findingregulatoryelementsandregulatorymotifsageneralprobabilisticframework |