
Spectral Embedded Deep Clustering

We propose a new clustering method based on a deep neural network. Given an unlabeled dataset and the number of clusters, our method directly groups the dataset into the given number of clusters in the original space. We use a conditional discrete probability distribution defined by a deep neural ne...


Bibliographic Details
Main authors: Wada, Yuichiro, Miyamoto, Shugo, Nakagama, Takumi, Andéol, Léo, Kumagai, Wataru, Kanamori, Takafumi
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7515324/
https://www.ncbi.nlm.nih.gov/pubmed/33267508
http://dx.doi.org/10.3390/e21080795
author Wada, Yuichiro
Miyamoto, Shugo
Nakagama, Takumi
Andéol, Léo
Kumagai, Wataru
Kanamori, Takafumi
collection PubMed
description We propose a new clustering method based on a deep neural network. Given an unlabeled dataset and the number of clusters, our method directly groups the dataset into the given number of clusters in the original space. We use a conditional discrete probability distribution defined by a deep neural network as a statistical model. Our strategy is first to estimate the cluster labels of unlabeled data points selected from a high-density region, and then to conduct semi-supervised learning to train the model by using the estimated cluster labels and the remaining unlabeled data points. Lastly, by using the trained model, we obtain the estimated cluster labels of all given unlabeled data points. The advantage of our method is that it does not require the key conditions that existing methods rely on. For instance, existing clustering methods with deep neural networks assume that the cluster balance of a given dataset is uniform. Moreover, our method can be applied to various data domains as long as the data is expressed by a feature vector. In addition, it is observed that our method is robust against outliers. Therefore, the proposed method is expected to perform, on average, better than previous methods. We conducted numerical experiments on five commonly used datasets to confirm the effectiveness of the proposed method.
format Online
Article
Text
id pubmed-7515324
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7515324 2020-11-09 Spectral Embedded Deep Clustering Entropy (Basel) Article MDPI 2019-08-15 /pmc/articles/PMC7515324/ /pubmed/33267508 http://dx.doi.org/10.3390/e21080795 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Spectral Embedded Deep Clustering
topic Article