Identifying Mixtures of Mixtures Using Bayesian Estimation

The use of a finite mixture of normal distributions in model-based clustering allows us to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by imposing constraints on the model or by using post-processing p...

Bibliographic Details
Main authors: Malsiner-Walli, Gertraud; Frühwirth-Schnatter, Sylvia; Grün, Bettina
Format: Online Article Text
Language: English
Published: Taylor & Francis 2017
Subjects: Bayesian Models
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5455957/
https://www.ncbi.nlm.nih.gov/pubmed/28626349
http://dx.doi.org/10.1080/10618600.2016.1200472
_version_ 1783241138937790464
author Malsiner-Walli, Gertraud
Frühwirth-Schnatter, Sylvia
Grün, Bettina
collection PubMed
description The use of a finite mixture of normal distributions in model-based clustering allows us to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior, where the hyperparameters are carefully selected such that they are reflective of the cluster structure aimed at. In addition, this prior allows us to estimate the model using standard MCMC sampling methods. In combination with a post-processing approach which resolves the label switching issue and results in an identified model, our approach allows us to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semiparametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark datasets. Supplementary materials for this article are available online.
format Online
Article
Text
id pubmed-5455957
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Taylor & Francis
record_format MEDLINE/PubMed
spelling pubmed-5455957 2017-06-15 Identifying Mixtures of Mixtures Using Bayesian Estimation Malsiner-Walli, Gertraud Frühwirth-Schnatter, Sylvia Grün, Bettina J Comput Graph Stat Bayesian Models Taylor & Francis 2017-04-03 2017-04-24 /pmc/articles/PMC5455957/ /pubmed/28626349 http://dx.doi.org/10.1080/10618600.2016.1200472 Text en © 2017 The Author(s). Published with license by Taylor & Francis http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identifying Mixtures of Mixtures Using Bayesian Estimation
topic Bayesian Models