Cargando…

Clustering Using Boosted Constrained k-Means Algorithm

This article proposes a constrained clustering algorithm with competitive performance and less computation time to the state-of-the-art methods, which consists of a constrained k-means algorithm enhanced by the boosting principle. Constrained k-means clustering using constraints as background knowle...

Descripción completa

Detalles Bibliográficos
Autores principales: Okabe, Masayuki, Yamada, Seiji
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Frontiers Media S.A. 2018
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805652/
https://www.ncbi.nlm.nih.gov/pubmed/33500905
http://dx.doi.org/10.3389/frobt.2018.00018
_version_ 1783636348913057792
author Okabe, Masayuki
Yamada, Seiji
author_facet Okabe, Masayuki
Yamada, Seiji
author_sort Okabe, Masayuki
collection PubMed
description This article proposes a constrained clustering algorithm with competitive performance and less computation time to the state-of-the-art methods, which consists of a constrained k-means algorithm enhanced by the boosting principle. Constrained k-means clustering using constraints as background knowledge, although easy to implement and quick, has insufficient performance compared with metric learning-based methods. Since it simply adds a function into the data assignment process of the k-means algorithm to check for constraint violations, it often exploits only a small number of constraints. Metric learning-based methods, which exploit constraints to create a new metric for data similarity, have shown promising results although the methods proposed so far are often slow depending on the amount of data or number of feature dimensions. We present a method that exploits the advantages of the constrained k-means and metric learning approaches. It incorporates a mechanism for accepting constraint priorities and a metric learning framework based on the boosting principle into a constrained k-means algorithm. In the framework, a metric is learned in the form of a kernel matrix that integrates weak cluster hypotheses produced by the constrained k-means algorithm, which works as a weak learner under the boosting principle. Experimental results for 12 data sets from 3 data sources demonstrated that our method has performance competitive to those of state-of-the-art constrained clustering methods for most data sets and that it takes much less computation time. Experimental evaluation demonstrated the effectiveness of controlling the constraint priorities by using the boosting principle and that our constrained k-means algorithm functions correctly as a weak learner of boosting.
format Online
Article
Text
id pubmed-7805652
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-78056522021-01-25 Clustering Using Boosted Constrained k-Means Algorithm Okabe, Masayuki Yamada, Seiji Front Robot AI Robotics and AI This article proposes a constrained clustering algorithm with competitive performance and less computation time to the state-of-the-art methods, which consists of a constrained k-means algorithm enhanced by the boosting principle. Constrained k-means clustering using constraints as background knowledge, although easy to implement and quick, has insufficient performance compared with metric learning-based methods. Since it simply adds a function into the data assignment process of the k-means algorithm to check for constraint violations, it often exploits only a small number of constraints. Metric learning-based methods, which exploit constraints to create a new metric for data similarity, have shown promising results although the methods proposed so far are often slow depending on the amount of data or number of feature dimensions. We present a method that exploits the advantages of the constrained k-means and metric learning approaches. It incorporates a mechanism for accepting constraint priorities and a metric learning framework based on the boosting principle into a constrained k-means algorithm. In the framework, a metric is learned in the form of a kernel matrix that integrates weak cluster hypotheses produced by the constrained k-means algorithm, which works as a weak learner under the boosting principle. Experimental results for 12 data sets from 3 data sources demonstrated that our method has performance competitive to those of state-of-the-art constrained clustering methods for most data sets and that it takes much less computation time. Experimental evaluation demonstrated the effectiveness of controlling the constraint priorities by using the boosting principle and that our constrained k-means algorithm functions correctly as a weak learner of boosting. Frontiers Media S.A. 2018-03-08 /pmc/articles/PMC7805652/ /pubmed/33500905 http://dx.doi.org/10.3389/frobt.2018.00018 Text en Copyright © 2018 Okabe and Yamada. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Robotics and AI
Okabe, Masayuki
Yamada, Seiji
Clustering Using Boosted Constrained k-Means Algorithm
title Clustering Using Boosted Constrained k-Means Algorithm
title_full Clustering Using Boosted Constrained k-Means Algorithm
title_fullStr Clustering Using Boosted Constrained k-Means Algorithm
title_full_unstemmed Clustering Using Boosted Constrained k-Means Algorithm
title_short Clustering Using Boosted Constrained k-Means Algorithm
title_sort clustering using boosted constrained k-means algorithm
topic Robotics and AI
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805652/
https://www.ncbi.nlm.nih.gov/pubmed/33500905
http://dx.doi.org/10.3389/frobt.2018.00018
work_keys_str_mv AT okabemasayuki clusteringusingboostedconstrainedkmeansalgorithm
AT yamadaseiji clusteringusingboostedconstrainedkmeansalgorithm