A Penalized Matrix Normal Mixture Model for Clustering Matrix Data
Along with advances in technology, matrix data, such as medical/industrial images, have emerged in many practical fields. These data usually have high dimensions and are not easy to cluster due to their intrinsic correlated structure among rows and columns. Most approaches convert matrix data to mul...
Main authors: | Heo, Jinwon; Baek, Jangsun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534904/ https://www.ncbi.nlm.nih.gov/pubmed/34681973 http://dx.doi.org/10.3390/e23101249 |
_version_ | 1784587656007516160 |
---|---|
author | Heo, Jinwon Baek, Jangsun |
collection | PubMed |
description | Along with advances in technology, matrix data, such as medical/industrial images, have emerged in many practical fields. These data are usually high-dimensional and are hard to cluster because of the intrinsic correlation structure among their rows and columns. Most approaches convert matrix data to multidimensional vectors and apply conventional clustering methods to them, and thus suffer from extreme high dimensionality as well as a lack of interpretability of the correlation structure among row/column variables. Recently, a regularized model was proposed for clustering matrix-valued data by imposing a sparsity structure on the mean signal of each cluster. We extend that approach by also regularizing the covariance, to cope better with the curse of dimensionality for large images. A penalized matrix normal mixture model with lasso-type penalty terms on both the mean and covariance matrices is proposed, and an expectation-maximization algorithm is developed to estimate the parameters. The proposed method both yields a parsimonious model and reflects the proper conditional correlation structure. The estimators are consistent, and their limiting distributions are derived. We applied the proposed method to simulated data as well as real datasets and measured its clustering performance with the clustering accuracy (ACC) and the adjusted Rand index (ARI). The experimental results show that the proposed method achieved higher ACC and ARI than conventional methods. |
format | Online Article Text |
id | pubmed-8534904 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8534904 2021-10-23 A Penalized Matrix Normal Mixture Model for Clustering Matrix Data Heo, Jinwon Baek, Jangsun Entropy (Basel) Article MDPI 2021-09-26 /pmc/articles/PMC8534904/ /pubmed/34681973 http://dx.doi.org/10.3390/e23101249 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | A Penalized Matrix Normal Mixture Model for Clustering Matrix Data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534904/ https://www.ncbi.nlm.nih.gov/pubmed/34681973 http://dx.doi.org/10.3390/e23101249 |
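
The description reports clustering performance as clustering accuracy (ACC) and the adjusted Rand index (ARI). The record does not give the authors' exact definitions, so the following is only a minimal sketch under common conventions: ACC is taken here to be the best fraction of matching labels over all relabelings of the predicted clusters, and ARI is the usual chance-corrected pair-counting index. The function names `clustering_accuracy` and `adjusted_rand_index` are illustrative, not taken from the article.

```python
from collections import Counter
from itertools import permutations
from math import comb


def clustering_accuracy(true_labels, pred_labels):
    """Best fraction of matching labels over all relabelings of predicted clusters.

    Assumption: brute-force search over permutations, which is only practical
    for a small number of clusters.
    """
    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(pred_labels))
    # Pad with None so every predicted cluster can be mapped to some target.
    targets = true_ids + [None] * max(0, len(pred_ids) - len(true_ids))
    best = 0
    for perm in permutations(targets, len(pred_ids)):
        mapping = dict(zip(pred_ids, perm))
        hits = sum(mapping[p] == t for t, p in zip(true_labels, pred_labels))
        best = max(best, hits)
    return best / len(true_labels)


def adjusted_rand_index(true_labels, pred_labels):
    """Chance-corrected agreement between two partitions of the same items."""
    n = len(true_labels)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Pair counts within contingency cells, true clusters, and predicted clusters.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    guess = [1, 1, 0, 0, 0, 0]
    print(clustering_accuracy(truth, guess))
    print(adjusted_rand_index(truth, guess))
```

Both measures ignore the arbitrary numbering of clusters, which is why they suit comparisons between a fitted mixture model and ground-truth classes. For many clusters, an assignment solver would be a more practical replacement for the permutation search.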