Cargando…

Compressed computations using wavelets for hidden Markov models with continuous observations

Compression as an accelerant of computation is increasingly recognized as an important component in engineering fast real-world machine learning methods for big data; c.f., its impact on genome-scale approximate string matching. Previous work showed that compression can accelerate algorithms for Hid...

Descripción completa

Detalles Bibliográficos
Autores principales: Bello, Luca, Wiedenhöft, John, Schliep, Alexander
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10243634/
https://www.ncbi.nlm.nih.gov/pubmed/37279196
http://dx.doi.org/10.1371/journal.pone.0286074
_version_ 1785054463800639488
author Bello, Luca
Wiedenhöft, John
Schliep, Alexander
author_facet Bello, Luca
Wiedenhöft, John
Schliep, Alexander
author_sort Bello, Luca
collection PubMed
description Compression as an accelerant of computation is increasingly recognized as an important component in engineering fast real-world machine learning methods for big data; c.f., its impact on genome-scale approximate string matching. Previous work showed that compression can accelerate algorithms for Hidden Markov Models (HMM) with discrete observations, both for the classical frequentist HMM algorithms—Forward Filtering, Backward Smoothing and Viterbi—and Gibbs sampling for Bayesian HMM. For Bayesian HMM with continuous-valued observations, compression was shown to greatly accelerate computations for specific types of data. For instance, data from large-scale experiments interrogating structural genetic variation can be assumed to be piece-wise constant with noise, or, equivalently, data generated by HMM with dominant self-transition probabilities. Here we extend the compressive computation approach to the classical frequentist HMM algorithms on continuous-valued observations, providing the first compressive approach for this problem. In a large-scale simulation study, we demonstrate empirically that in many settings compressed HMM algorithms very clearly outperform the classical algorithms with no, or only an insignificant effect, on the computed probabilities and infered state paths of maximal likelihood. This provides an efficient approach to big data computations with HMM. An open-source implementation of the method is available from https://github.com/lucabello/wavelet-hmms.
format Online
Article
Text
id pubmed-10243634
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-102436342023-06-07 Compressed computations using wavelets for hidden Markov models with continuous observations Bello, Luca Wiedenhöft, John Schliep, Alexander PLoS One Research Article Compression as an accelerant of computation is increasingly recognized as an important component in engineering fast real-world machine learning methods for big data; c.f., its impact on genome-scale approximate string matching. Previous work showed that compression can accelerate algorithms for Hidden Markov Models (HMM) with discrete observations, both for the classical frequentist HMM algorithms—Forward Filtering, Backward Smoothing and Viterbi—and Gibbs sampling for Bayesian HMM. For Bayesian HMM with continuous-valued observations, compression was shown to greatly accelerate computations for specific types of data. For instance, data from large-scale experiments interrogating structural genetic variation can be assumed to be piece-wise constant with noise, or, equivalently, data generated by HMM with dominant self-transition probabilities. Here we extend the compressive computation approach to the classical frequentist HMM algorithms on continuous-valued observations, providing the first compressive approach for this problem. In a large-scale simulation study, we demonstrate empirically that in many settings compressed HMM algorithms very clearly outperform the classical algorithms with no, or only an insignificant effect, on the computed probabilities and infered state paths of maximal likelihood. This provides an efficient approach to big data computations with HMM. An open-source implementation of the method is available from https://github.com/lucabello/wavelet-hmms. Public Library of Science 2023-06-06 /pmc/articles/PMC10243634/ /pubmed/37279196 http://dx.doi.org/10.1371/journal.pone.0286074 Text en © 2023 Bello et al https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Bello, Luca
Wiedenhöft, John
Schliep, Alexander
Compressed computations using wavelets for hidden Markov models with continuous observations
title Compressed computations using wavelets for hidden Markov models with continuous observations
title_full Compressed computations using wavelets for hidden Markov models with continuous observations
title_fullStr Compressed computations using wavelets for hidden Markov models with continuous observations
title_full_unstemmed Compressed computations using wavelets for hidden Markov models with continuous observations
title_short Compressed computations using wavelets for hidden Markov models with continuous observations
title_sort compressed computations using wavelets for hidden markov models with continuous observations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10243634/
https://www.ncbi.nlm.nih.gov/pubmed/37279196
http://dx.doi.org/10.1371/journal.pone.0286074
work_keys_str_mv AT belloluca compressedcomputationsusingwaveletsforhiddenmarkovmodelswithcontinuousobservations
AT wiedenhoftjohn compressedcomputationsusingwaveletsforhiddenmarkovmodelswithcontinuousobservations
AT schliepalexander compressedcomputationsusingwaveletsforhiddenmarkovmodelswithcontinuousobservations