A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average
Clustering is a major unsupervised learning algorithm and is widely applied in data mining and statistical data analyses. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized into hard, soft, and model-based clusterings, respectively. We propose a new c...
Main Authors: | Komori, Osamu; Eguchi, Shinto |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8145026/ https://www.ncbi.nlm.nih.gov/pubmed/33923177 http://dx.doi.org/10.3390/e23050518 |
_version_ | 1783697084157788160 |
---|---|
author | Komori, Osamu; Eguchi, Shinto |
author_facet | Komori, Osamu; Eguchi, Shinto |
author_sort | Komori, Osamu |
collection | PubMed |
description | Clustering is a major unsupervised learning algorithm and is widely applied in data mining and statistical data analyses. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized into hard, soft, and model-based clusterings, respectively. We propose a new clustering, called Pareto clustering, based on the Kolmogorov–Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all the aforementioned clusterings plus maximum-entropy clustering. We introduce a probabilistic framework for the proposed method, in which the underlying distribution to give consistency is discussed. We build the minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare the performance with existing methods in simulation studies and in benchmark dataset analyses to demonstrate its highly practical utilities. |
format | Online Article Text |
id | pubmed-8145026 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-81450262021-05-26 A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average Komori, Osamu Eguchi, Shinto Entropy (Basel) Article Clustering is a major unsupervised learning algorithm and is widely applied in data mining and statistical data analyses. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized into hard, soft, and model-based clusterings, respectively. We propose a new clustering, called Pareto clustering, based on the Kolmogorov–Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all the aforementioned clusterings plus maximum-entropy clustering. We introduce a probabilistic framework for the proposed method, in which the underlying distribution to give consistency is discussed. We build the minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare the performance with existing methods in simulation studies and in benchmark dataset analyses to demonstrate its highly practical utilities. MDPI 2021-04-24 /pmc/articles/PMC8145026/ /pubmed/33923177 http://dx.doi.org/10.3390/e23050518 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Komori, Osamu Eguchi, Shinto A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average |
title | A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average |
title_sort | unified formulation of k-means, fuzzy c-means and gaussian mixture model by the kolmogorov–nagumo average |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8145026/ https://www.ncbi.nlm.nih.gov/pubmed/33923177 http://dx.doi.org/10.3390/e23050518 |