
Summarizing Finite Mixture Model with Overlapping Quantification

Finite mixture models are widely used for modeling and clustering data. When they are used for clustering, they are often interpreted by regarding each component as one cluster. However, this assumption may be invalid when the components overlap. It leads to the issue of analyzing such overlaps to c...


Bibliographic Details
Main authors: Kyoya, Shunki, Yamanishi, Kenji
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8622449/
https://www.ncbi.nlm.nih.gov/pubmed/34828201
http://dx.doi.org/10.3390/e23111503
_version_ 1784605696394788864
author Kyoya, Shunki
Yamanishi, Kenji
author_facet Kyoya, Shunki
Yamanishi, Kenji
author_sort Kyoya, Shunki
collection PubMed
description Finite mixture models are widely used for modeling and clustering data. When they are used for clustering, they are often interpreted by regarding each component as one cluster. However, this assumption may be invalid when the components overlap. It leads to the issue of analyzing such overlaps to correctly understand the models. The primary purpose of this paper is to establish a theoretical framework for interpreting the overlapping mixture models by estimating how they overlap, using measures of information such as entropy and mutual information. This is achieved by merging components to regard multiple components as one cluster and summarizing the merging results. First, we propose three conditions that any merging criterion should satisfy. Then, we investigate whether several existing merging criteria satisfy the conditions and modify them to fulfill more conditions. Second, we propose a novel concept named clustering summarization to evaluate the merging results. In it, we can quantify how overlapped and biased the clusters are, using mutual information-based criteria. Using artificial and real datasets, we empirically demonstrate that our methods of modifying criteria and summarizing results are effective for understanding the cluster structures. We therefore give a new view of interpretability/explainability for model-based clustering.
format Online
Article
Text
id pubmed-8622449
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8622449 2021-11-27 Summarizing Finite Mixture Model with Overlapping Quantification Kyoya, Shunki Yamanishi, Kenji Entropy (Basel) Article MDPI 2021-11-13 /pmc/articles/PMC8622449/ /pubmed/34828201 http://dx.doi.org/10.3390/e23111503 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Kyoya, Shunki
Yamanishi, Kenji
Summarizing Finite Mixture Model with Overlapping Quantification
title Summarizing Finite Mixture Model with Overlapping Quantification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8622449/
https://www.ncbi.nlm.nih.gov/pubmed/34828201
http://dx.doi.org/10.3390/e23111503