A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average

Clustering is a major unsupervised learning algorithm and is widely applied in data mining and statistical data analyses. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized into hard, soft, and model-based clusterings, respectively. We propose a new c...

Bibliographic Details
Main authors: Komori, Osamu; Eguchi, Shinto
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8145026/
https://www.ncbi.nlm.nih.gov/pubmed/33923177
http://dx.doi.org/10.3390/e23050518
author Komori, Osamu
Eguchi, Shinto
collection PubMed
description Clustering is a major unsupervised learning algorithm and is widely applied in data mining and statistical data analyses. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized into hard, soft, and model-based clusterings, respectively. We propose a new clustering, called Pareto clustering, based on the Kolmogorov–Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all the aforementioned clusterings plus maximum-entropy clustering. We introduce a probabilistic framework for the proposed method, in which the underlying distribution to give consistency is discussed. We build the minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare the performance with existing methods in simulation studies and in benchmark dataset analyses to demonstrate its highly practical utilities.
format Online
Article
Text
id pubmed-8145026
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8145026 2021-05-26 A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average Komori, Osamu Eguchi, Shinto Entropy (Basel) Article Clustering is a major unsupervised learning algorithm and is widely applied in data mining and statistical data analyses. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized into hard, soft, and model-based clusterings, respectively. We propose a new clustering, called Pareto clustering, based on the Kolmogorov–Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all the aforementioned clusterings plus maximum-entropy clustering. We introduce a probabilistic framework for the proposed method, in which the underlying distribution to give consistency is discussed. We build the minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare the performance with existing methods in simulation studies and in benchmark dataset analyses to demonstrate its highly practical utilities. MDPI 2021-04-24 /pmc/articles/PMC8145026/ /pubmed/33923177 http://dx.doi.org/10.3390/e23050518 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average
topic Article
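
The description field above states that Pareto clustering rests on the Kolmogorov–Nagumo average. As a rough illustration only, the Python sketch below computes the general Kolmogorov–Nagumo (quasi-arithmetic) average, that is, the inverse generator applied to the mean of the generator values, for a few generators. The names `kn_average`, `lomax_survival`, and `lomax_survival_inv`, the sample values, and the Lomax (Pareto type II) parameterization with `shape` and `scale` are assumptions made for this example. The record does not give the exact generator used by the authors, so this should not be read as their algorithm.

```python
import math
from typing import Callable, Sequence


def kn_average(
    values: Sequence[float],
    phi: Callable[[float], float],
    phi_inv: Callable[[float], float],
) -> float:
    """Kolmogorov-Nagumo average: phi_inv(mean(phi(x_i))).

    phi must be continuous and strictly monotone on the range of values.
    """
    mean_of_phi = sum(phi(v) for v in values) / len(values)
    return phi_inv(mean_of_phi)


# ASSUMPTION: a Lomax (Pareto type II) survival function is used here as a
# stand-in for "a survival function of the Pareto distribution"; the paper's
# exact parameterization is not given in this record.
def lomax_survival(t: float, shape: float = 2.0, scale: float = 1.0) -> float:
    return (1.0 + t / scale) ** (-shape)


def lomax_survival_inv(u: float, shape: float = 2.0, scale: float = 1.0) -> float:
    return scale * (u ** (-1.0 / shape) - 1.0)


values = [1.0, 2.0, 4.0]  # hypothetical non-negative inputs, e.g. distances

# Identity generator: ordinary arithmetic mean.
arithmetic = kn_average(values, lambda t: t, lambda u: u)

# Logarithmic generator: geometric mean.
geometric = kn_average(values, math.log, math.exp)

# Reciprocal generator: harmonic mean.
harmonic = kn_average(values, lambda t: 1.0 / t, lambda u: 1.0 / u)

# Exponential generator with a small temperature: a smooth minimum.
tau = 0.01
soft_min = kn_average(
    values,
    lambda t: math.exp(-t / tau),
    lambda u: -tau * math.log(u),
)

# Pareto-type generator (illustrative parameterization, see ASSUMPTION above).
pareto_avg = kn_average(values, lomax_survival, lomax_survival_inv)

print(arithmetic, geometric, harmonic, soft_min, pareto_avg)
```

Swapping the generator changes the kind of average while the formula stays the same. The exponential generator with a small temperature approaches the minimum of the inputs, which hints at how a smooth average can mimic a hard nearest-centre choice; how the paper connects k-means, fuzzy c-means, and Gaussian mixture models through its own generator is not specified in this record.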