Stochastic EM-based TFBS motif discovery with MITSU
Motivation: The Expectation–Maximization (EM) algorithm has been successfully applied to the problem of transcription factor binding site (TFBS) motif discovery and underlies the most widely used motif discovery algorithms. In the wider field of probabilistic modelling, the stochastic EM (sEM) algor...
Main authors: | Kilpatrick, Alastair M., Ward, Bruce, Aitken, Stuart |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Oxford University Press
2014
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058950/ https://www.ncbi.nlm.nih.gov/pubmed/24931999 http://dx.doi.org/10.1093/bioinformatics/btu286 |
_version_ | 1782321192642805760 |
---|---|
author | Kilpatrick, Alastair M.; Ward, Bruce; Aitken, Stuart |
author_facet | Kilpatrick, Alastair M.; Ward, Bruce; Aitken, Stuart |
author_sort | Kilpatrick, Alastair M. |
collection | PubMed |
description | Motivation: The Expectation–Maximization (EM) algorithm has been successfully applied to the problem of transcription factor binding site (TFBS) motif discovery and underlies the most widely used motif discovery algorithms. In the wider field of probabilistic modelling, the stochastic EM (sEM) algorithm has been used to overcome some of the limitations of the EM algorithm; however, the application of sEM to motif discovery has not been fully explored. Results: We present MITSU (Motif discovery by ITerative Sampling and Updating), a novel algorithm for motif discovery, which combines sEM with an improved approximation to the likelihood function, which is unconstrained with regard to the distribution of motif occurrences within the input dataset. The algorithm is evaluated quantitatively on realistic synthetic data and several collections of characterized prokaryotic TFBS motifs and shown to outperform EM and an alternative sEM-based algorithm, particularly in terms of site-level positive predictive value. Availability and implementation: Java executable available for download at http://www.sourceforge.net/p/mitsu-motif/, supported on Linux/OS X. Contact: a.m.kilpatrick@sms.ed.ac.uk |
format | Online Article Text |
id | pubmed-4058950 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-40589502014-06-18 Stochastic EM-based TFBS motif discovery with MITSU Kilpatrick, Alastair M. Ward, Bruce Aitken, Stuart Bioinformatics Ismb 2014 Proceedings Papers Committee Motivation: The Expectation–Maximization (EM) algorithm has been successfully applied to the problem of transcription factor binding site (TFBS) motif discovery and underlies the most widely used motif discovery algorithms. In the wider field of probabilistic modelling, the stochastic EM (sEM) algorithm has been used to overcome some of the limitations of the EM algorithm; however, the application of sEM to motif discovery has not been fully explored. Results: We present MITSU (Motif discovery by ITerative Sampling and Updating), a novel algorithm for motif discovery, which combines sEM with an improved approximation to the likelihood function, which is unconstrained with regard to the distribution of motif occurrences within the input dataset. The algorithm is evaluated quantitatively on realistic synthetic data and several collections of characterized prokaryotic TFBS motifs and shown to outperform EM and an alternative sEM-based algorithm, particularly in terms of site-level positive predictive value. Availability and implementation: Java executable available for download at http://www.sourceforge.net/p/mitsu-motif/, supported on Linux/OS X. Contact: a.m.kilpatrick@sms.ed.ac.uk Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4058950/ /pubmed/24931999 http://dx.doi.org/10.1093/bioinformatics/btu286 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. 
For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb 2014 Proceedings Papers Committee Kilpatrick, Alastair M. Ward, Bruce Aitken, Stuart Stochastic EM-based TFBS motif discovery with MITSU |
title | Stochastic EM-based TFBS motif discovery with MITSU |
title_full | Stochastic EM-based TFBS motif discovery with MITSU |
title_fullStr | Stochastic EM-based TFBS motif discovery with MITSU |
title_full_unstemmed | Stochastic EM-based TFBS motif discovery with MITSU |
title_short | Stochastic EM-based TFBS motif discovery with MITSU |
title_sort | stochastic em-based tfbs motif discovery with mitsu |
topic | Ismb 2014 Proceedings Papers Committee |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058950/ https://www.ncbi.nlm.nih.gov/pubmed/24931999 http://dx.doi.org/10.1093/bioinformatics/btu286 |
work_keys_str_mv | AT kilpatrickalastairm stochasticembasedtfbsmotifdiscoverywithmitsu AT wardbruce stochasticembasedtfbsmotifdiscoverywithmitsu AT aitkenstuart stochasticembasedtfbsmotifdiscoverywithmitsu |