
Implementing EM and Viterbi algorithms for Hidden Markov Model in linear memory

BACKGROUND: The Baum-Welch learning procedure for Hidden Markov Models (HMMs) provides a powerful tool for tailoring HMM topologies to data for use in knowledge discovery and clustering. A linear memory procedure recently proposed by Miklós, I. and Meyer, I.M. describes a memory-sparse version of the Baum-Welch algorithm with modifications to the original probabilistic table topologies to make memory use independent of sequence length (and linearly dependent on state number). The original description of the technique has some errors that we amend. We then compare the corrected implementation on a variety of data sets with conventional and checkpointing implementations.

RESULTS: We provide a correct recurrence relation for the emission parameter estimate and extend it to parameter estimates of the Normal distribution. To accelerate estimation of the prior state probabilities, and to decrease memory use, we reverse the originally proposed forward sweep. We describe the different scaling strategies necessary in all real implementations of the algorithm to prevent underflow. In this paper we also describe our approach to a linear memory implementation of the Viterbi decoding algorithm (with linearity in the sequence length, while memory use is approximately independent of state number). We demonstrate the use of the linear memory implementation on an extended Duration Hidden Markov Model (DHMM) and on an HMM with a spike detection topology. Comparing the various implementations of the Baum-Welch procedure, we find that the checkpointing algorithm produces the best overall tradeoff between memory use and speed. In cases where sequence length is very large (for Baum-Welch) or state number is very large (for Viterbi), the linear memory methods outlined may offer some utility.

CONCLUSION: Our performance-optimized Java implementations of the Baum-Welch algorithm are available at . The described method and implementations will aid sequence alignment, gene structure prediction, HMM profile training, nanopore ionic flow blockade analysis, and many other domains that require efficient HMM training with EM.
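The scaling the abstract refers to is not spelled out in this record, but one common strategy (Rabiner-style per-column normalization) can be sketched. The Java class below is a minimal, hypothetical illustration, not the authors' implementation: normalizing each forward column to sum to one keeps the values within floating-point range for arbitrarily long sequences, the accumulated logarithms of the scale factors recover the log-likelihood, and keeping only a single column at a time makes memory use independent of sequence length, as the abstract describes. The toy parameters in main are invented for the example.

import java.util.Arrays;

public class ScaledForward {

    /**
     * Returns log P(O | model) using scaled forward variables.
     * a[i][j] = transition probability, b[j][o] = emission probability,
     * pi[i] = initial state probability, obs = observation symbol indices.
     */
    static double logLikelihood(double[][] a, double[][] b, double[] pi, int[] obs) {
        int n = pi.length;          // number of states
        int t = obs.length;         // sequence length
        double[] alpha = new double[n];
        double logLik = 0.0;

        // Initialization: alpha_1(i) = pi_i * b_i(O_1), then rescale.
        for (int i = 0; i < n; i++) alpha[i] = pi[i] * b[i][obs[0]];
        logLik += rescale(alpha);

        // Induction: only the previous column is kept, so memory is O(n),
        // independent of the sequence length t.
        double[] next = new double[n];
        for (int k = 1; k < t; k++) {
            for (int j = 0; j < n; j++) {
                double s = 0.0;
                for (int i = 0; i < n; i++) s += alpha[i] * a[i][j];
                next[j] = s * b[j][obs[k]];
            }
            double[] tmp = alpha; alpha = next; next = tmp; // swap buffers
            logLik += rescale(alpha);
        }
        return logLik;
    }

    /** Normalizes the column to sum to 1; returns the log of the scale factor. */
    static double rescale(double[] v) {
        double sum = 0.0;
        for (double x : v) sum += x;
        for (int i = 0; i < v.length; i++) v[i] /= sum;
        return Math.log(sum);
    }

    public static void main(String[] args) {
        // Toy 2-state, 2-symbol model (hypothetical parameters).
        double[][] a = {{0.9, 0.1}, {0.2, 0.8}};
        double[][] b = {{0.7, 0.3}, {0.1, 0.9}};
        double[] pi = {0.5, 0.5};
        int[] obs = new int[100000];  // a sequence this long would underflow
        Arrays.fill(obs, 1);          // an unscaled forward pass
        System.out.println(logLikelihood(a, b, pi, obs));
    }
}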

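The checkpointing implementation that the abstract reports as the best overall memory/speed tradeoff is likewise only named here, but the general idea can be sketched: store forward columns only at roughly sqrt(T)-spaced checkpoints and recompute the columns inside a block on demand during the backward sweep, cutting memory from O(T*N) to roughly O(sqrt(T)*N) at the cost of about one extra forward pass. The class below is a hypothetical sketch under those assumptions; a production backward sweep would recompute each block of columns once and iterate over it, rather than replaying per column as column() does here for clarity.

public class CheckpointedForward {
    final double[][] a;           // transitions a[i][j]
    final double[][] b;           // emissions  b[j][symbol]
    final double[] pi;            // initial state distribution
    final int[] obs;              // observation sequence
    final int blockSize;          // ~ sqrt(T)
    final double[][] checkpoints; // stored (normalized) forward columns

    CheckpointedForward(double[][] a, double[][] b, double[] pi, int[] obs) {
        this.a = a; this.b = b; this.pi = pi; this.obs = obs;
        this.blockSize = Math.max(1, (int) Math.sqrt(obs.length));
        this.checkpoints = new double[(obs.length + blockSize - 1) / blockSize][];
        // One forward sweep, keeping only every blockSize-th column.
        double[] alpha = initColumn();
        checkpoints[0] = alpha.clone();
        for (int t = 1; t < obs.length; t++) {
            alpha = step(alpha, t);
            if (t % blockSize == 0) checkpoints[t / blockSize] = alpha.clone();
        }
    }

    /** Recovers the forward column at time t by replaying from its block's checkpoint. */
    double[] column(int t) {
        double[] alpha = checkpoints[t / blockSize].clone();
        for (int k = (t / blockSize) * blockSize + 1; k <= t; k++) alpha = step(alpha, k);
        return alpha;
    }

    private double[] initColumn() {
        double[] alpha = new double[pi.length];
        for (int i = 0; i < pi.length; i++) alpha[i] = pi[i] * b[i][obs[0]];
        return normalize(alpha);
    }

    private double[] step(double[] prev, int t) {
        double[] next = new double[prev.length];
        for (int j = 0; j < next.length; j++) {
            double s = 0.0;
            for (int i = 0; i < prev.length; i++) s += prev[i] * a[i][j];
            next[j] = s * b[j][obs[t]];
        }
        return normalize(next); // per-column scaling, as in the sketch above
    }

    private static double[] normalize(double[] v) {
        double sum = 0.0;
        for (double x : v) sum += x;
        for (int i = 0; i < v.length; i++) v[i] /= sum;
        return v;
    }
}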

Bibliographic Details
Main Authors: Churbanov, Alexander; Winters-Hilt, Stephen
Format: Text
Language: English
Published: BioMed Central, 2008
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2430973/
https://www.ncbi.nlm.nih.gov/pubmed/18447951
http://dx.doi.org/10.1186/1471-2105-9-224
Record Details
Record ID: pubmed-2430973
Collection: PubMed
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: BMC Bioinformatics (Methodology Article)
Published Online: 2008-04-30
License: Copyright © 2008 Churbanov and Winters-Hilt; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.