A Penalized Matrix Normal Mixture Model for Clustering Matrix Data

Along with advances in technology, matrix data, such as medical/industrial images, have emerged in many practical fields. These data usually have high dimensions and are not easy to cluster due to their intrinsic correlated structure among rows and columns. Most approaches convert matrix data to mul...


Bibliographic Details
Main authors: Heo, Jinwon; Baek, Jangsun
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534904/
https://www.ncbi.nlm.nih.gov/pubmed/34681973
http://dx.doi.org/10.3390/e23101249
author Heo, Jinwon
Baek, Jangsun
collection PubMed
description Along with advances in technology, matrix data, such as medical/industrial images, have emerged in many practical fields. These data usually have high dimensions and are not easy to cluster due to their intrinsic correlated structure among rows and columns. Most approaches convert matrix data to multidimensional vectors and apply conventional clustering methods to them, and thus suffer from extreme high dimensionality as well as a lack of interpretability of the correlated structure among row/column variables. Recently, a regularized model was proposed for clustering matrix-valued data by imposing a sparsity structure on the mean signal of each cluster. We extend that approach by further regularizing the covariance to cope better with the curse of dimensionality for large images. A penalized matrix normal mixture model with lasso-type penalty terms on both the mean and covariance matrices is proposed, and an expectation-maximization algorithm is developed to estimate the parameters. The proposed method both models the data parsimoniously and reflects the proper conditional correlation structure. The estimators are consistent, and their limiting distributions are derived. We applied the proposed method to simulated data as well as real datasets and measured its clustering performance with the clustering accuracy (ACC) and the adjusted Rand index (ARI). The experimental results show that the proposed method achieved higher ACC and ARI than conventional methods.
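The description names ACC and ARI as the evaluation measures, but this record does not give their formulas. As an illustration only, the sketch below computes both using their standard textbook definitions. The function names `clustering_accuracy` and `adjusted_rand_index` are hypothetical, the brute-force search over label permutations is an assumption made for brevity (it is practical only for a handful of clusters), and none of this is taken from the authors' code.

```python
from collections import Counter
from itertools import permutations
from math import comb


def clustering_accuracy(true_labels, pred_labels):
    """ACC: best fraction of agreeing labels over one-to-one relabelings
    of the predicted clusters (brute force, small cluster counts only)."""
    pairs = Counter(zip(true_labels, pred_labels))
    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(pred_labels))
    # Pad with None so surplus predicted clusters can stay unmatched.
    candidates = true_ids + [None] * max(0, len(pred_ids) - len(true_ids))
    best = 0
    for assignment in permutations(candidates, len(pred_ids)):
        # assignment[i] is the true class matched to predicted cluster pred_ids[i].
        hits = sum(pairs[(t, p)] for t, p in zip(assignment, pred_ids))
        best = max(best, hits)
    return best / len(true_labels)


def adjusted_rand_index(true_labels, pred_labels):
    """ARI from the contingency table; assumes at least two observations."""
    n = len(true_labels)
    pairs = Counter(zip(true_labels, pred_labels))
    sum_cells = sum(comb(c, 2) for c in pairs.values())
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case: both partitions are trivial and identical.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    predicted = [1, 1, 0, 0, 0, 0]  # cluster ids are arbitrary names
    print("ACC:", clustering_accuracy(truth, predicted))
    print("ARI:", adjusted_rand_index(truth, predicted))
```

Both measures ignore how the clusters are named, so a prediction that only swaps the cluster ids scores 1 on each. ARI additionally corrects for agreement expected by chance, which is why it can be much lower than ACC on the same labeling.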
format Online
Article
Text
id pubmed-8534904
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8534904 2021-10-23 A Penalized Matrix Normal Mixture Model for Clustering Matrix Data Heo, Jinwon Baek, Jangsun Entropy (Basel) Article MDPI 2021-09-26 /pmc/articles/PMC8534904/ /pubmed/34681973 http://dx.doi.org/10.3390/e23101249 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Penalized Matrix Normal Mixture Model for Clustering Matrix Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8534904/
https://www.ncbi.nlm.nih.gov/pubmed/34681973
http://dx.doi.org/10.3390/e23101249