
BELMM: Bayesian model selection and random walk smoothing in time-series clustering

MOTIVATION: Due to advances in measuring technology, many new phenotype, gene expression, and other omics time-course datasets are now commonly available. Cluster analysis may provide useful information about the structure of such data. RESULTS: In this work, we propose BELMM (Bayesian Estimation of...


Bibliographic Details
Main authors:	Sarala, Olli; Pyhäjärvi, Tanja; Sillanpää, Mikko J
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2023
Subjects:	Original Paper
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10686958/ https://www.ncbi.nlm.nih.gov/pubmed/37963057 http://dx.doi.org/10.1093/bioinformatics/btad686

_version_	1785151877111873536
author	Sarala, Olli; Pyhäjärvi, Tanja; Sillanpää, Mikko J
author_facet	Sarala, Olli; Pyhäjärvi, Tanja; Sillanpää, Mikko J
author_sort	Sarala, Olli
collection	PubMed
description	MOTIVATION: Due to advances in measuring technology, many new phenotype, gene expression, and other omics time-course datasets are now commonly available. Cluster analysis may provide useful information about the structure of such data. RESULTS: In this work, we propose BELMM (Bayesian Estimation of Latent Mixture Models): a flexible framework for analysing, clustering, and modelling time-series data in a Bayesian setting. The framework is built on mixture modelling: first, the mean curves of the mixture components are assumed to follow random walk smoothing priors. Second, we choose the most plausible model and the number of mixture components using the Reversible-jump Markov chain Monte Carlo. Last, we assign the individual time series into clusters based on the similarity to the cluster-specific trend curves determined by the latent random walk processes. We demonstrate the use of fast and slow implementations of our approach on both simulated and real time-series data using widely available software R, Stan, and CU-MSDSp. AVAILABILITY AND IMPLEMENTATION: The French mortality dataset is available at http://www.mortality.org, the Drosophila melanogaster embryogenesis gene expression data at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE121160. Details on our simulated datasets are available in the Supplementary Material, and R scripts and a detailed tutorial on GitHub at https://github.com/ollisa/BELMM. The software CU-MSDSp is available on GitHub at https://github.com/jtchavisIII/CU-MSDSp.
format	Online Article Text
id	pubmed-10686958
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-10686958 2023-11-30 BELMM: Bayesian model selection and random walk smoothing in time-series clustering Sarala, Olli; Pyhäjärvi, Tanja; Sillanpää, Mikko J Bioinformatics Original Paper MOTIVATION: Due to advances in measuring technology, many new phenotype, gene expression, and other omics time-course datasets are now commonly available. Cluster analysis may provide useful information about the structure of such data. RESULTS: In this work, we propose BELMM (Bayesian Estimation of Latent Mixture Models): a flexible framework for analysing, clustering, and modelling time-series data in a Bayesian setting. The framework is built on mixture modelling: first, the mean curves of the mixture components are assumed to follow random walk smoothing priors. Second, we choose the most plausible model and the number of mixture components using the Reversible-jump Markov chain Monte Carlo. Last, we assign the individual time series into clusters based on the similarity to the cluster-specific trend curves determined by the latent random walk processes. We demonstrate the use of fast and slow implementations of our approach on both simulated and real time-series data using widely available software R, Stan, and CU-MSDSp. AVAILABILITY AND IMPLEMENTATION: The French mortality dataset is available at http://www.mortality.org, the Drosophila melanogaster embryogenesis gene expression data at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE121160. Details on our simulated datasets are available in the Supplementary Material, and R scripts and a detailed tutorial on GitHub at https://github.com/ollisa/BELMM. The software CU-MSDSp is available on GitHub at https://github.com/jtchavisIII/CU-MSDSp. Oxford University Press 2023-11-14 /pmc/articles/PMC10686958/ /pubmed/37963057 http://dx.doi.org/10.1093/bioinformatics/btad686 Text en © The Author(s) 2023. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Original Paper Sarala, Olli Pyhäjärvi, Tanja Sillanpää, Mikko J BELMM: Bayesian model selection and random walk smoothing in time-series clustering
title	BELMM: Bayesian model selection and random walk smoothing in time-series clustering
title_full	BELMM: Bayesian model selection and random walk smoothing in time-series clustering
title_fullStr	BELMM: Bayesian model selection and random walk smoothing in time-series clustering
title_full_unstemmed	BELMM: Bayesian model selection and random walk smoothing in time-series clustering
title_short	BELMM: Bayesian model selection and random walk smoothing in time-series clustering
title_sort	belmm: bayesian model selection and random walk smoothing in time-series clustering
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10686958/ https://www.ncbi.nlm.nih.gov/pubmed/37963057 http://dx.doi.org/10.1093/bioinformatics/btad686
work_keys_str_mv	AT saralaolli belmmbayesianmodelselectionandrandomwalksmoothingintimeseriesclustering AT pyhajarvitanja belmmbayesianmodelselectionandrandomwalksmoothingintimeseriesclustering AT sillanpaamikkoj belmmbayesianmodelselectionandrandomwalksmoothingintimeseriesclustering
