Stochastic EM-based TFBS motif discovery with MITSU

Motivation: The Expectation–Maximization (EM) algorithm has been successfully applied to the problem of transcription factor binding site (TFBS) motif discovery and underlies the most widely used motif discovery algorithms. In the wider field of probabilistic modelling, the stochastic EM (sEM) algor...

Bibliographic Details
Main authors: Kilpatrick, Alastair M.; Ward, Bruce; Aitken, Stuart
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects: Ismb 2014 Proceedings Papers Committee
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058950/
https://www.ncbi.nlm.nih.gov/pubmed/24931999
http://dx.doi.org/10.1093/bioinformatics/btu286
author Kilpatrick, Alastair M.
Ward, Bruce
Aitken, Stuart
collection PubMed
description Motivation: The Expectation–Maximization (EM) algorithm has been successfully applied to the problem of transcription factor binding site (TFBS) motif discovery and underlies the most widely used motif discovery algorithms. In the wider field of probabilistic modelling, the stochastic EM (sEM) algorithm has been used to overcome some of the limitations of the EM algorithm; however, the application of sEM to motif discovery has not been fully explored. Results: We present MITSU (Motif discovery by ITerative Sampling and Updating), a novel algorithm for motif discovery, which combines sEM with an improved approximation to the likelihood function, which is unconstrained with regard to the distribution of motif occurrences within the input dataset. The algorithm is evaluated quantitatively on realistic synthetic data and several collections of characterized prokaryotic TFBS motifs and shown to outperform EM and an alternative sEM-based algorithm, particularly in terms of site-level positive predictive value. Availability and implementation: Java executable available for download at http://www.sourceforge.net/p/mitsu-motif/, supported on Linux/OS X. Contact: a.m.kilpatrick@sms.ed.ac.uk
format Online
Article
Text
id pubmed-4058950
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4058950 2014-06-18 Stochastic EM-based TFBS motif discovery with MITSU Kilpatrick, Alastair M. Ward, Bruce Aitken, Stuart Bioinformatics Ismb 2014 Proceedings Papers Committee Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4058950/ /pubmed/24931999 http://dx.doi.org/10.1093/bioinformatics/btu286 Text en © The Author 2014. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Stochastic EM-based TFBS motif discovery with MITSU
topic Ismb 2014 Proceedings Papers Committee