Cargando…
Clustering Using Boosted Constrained k-Means Algorithm
This article proposes a constrained clustering algorithm with competitive performance and less computation time to the state-of-the-art methods, which consists of a constrained k-means algorithm enhanced by the boosting principle. Constrained k-means clustering using constraints as background knowle...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Frontiers Media S.A.
2018
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805652/ https://www.ncbi.nlm.nih.gov/pubmed/33500905 http://dx.doi.org/10.3389/frobt.2018.00018 |
_version_ | 1783636348913057792 |
---|---|
author | Okabe, Masayuki Yamada, Seiji |
author_facet | Okabe, Masayuki Yamada, Seiji |
author_sort | Okabe, Masayuki |
collection | PubMed |
description | This article proposes a constrained clustering algorithm with competitive performance and less computation time to the state-of-the-art methods, which consists of a constrained k-means algorithm enhanced by the boosting principle. Constrained k-means clustering using constraints as background knowledge, although easy to implement and quick, has insufficient performance compared with metric learning-based methods. Since it simply adds a function into the data assignment process of the k-means algorithm to check for constraint violations, it often exploits only a small number of constraints. Metric learning-based methods, which exploit constraints to create a new metric for data similarity, have shown promising results although the methods proposed so far are often slow depending on the amount of data or number of feature dimensions. We present a method that exploits the advantages of the constrained k-means and metric learning approaches. It incorporates a mechanism for accepting constraint priorities and a metric learning framework based on the boosting principle into a constrained k-means algorithm. In the framework, a metric is learned in the form of a kernel matrix that integrates weak cluster hypotheses produced by the constrained k-means algorithm, which works as a weak learner under the boosting principle. Experimental results for 12 data sets from 3 data sources demonstrated that our method has performance competitive to those of state-of-the-art constrained clustering methods for most data sets and that it takes much less computation time. Experimental evaluation demonstrated the effectiveness of controlling the constraint priorities by using the boosting principle and that our constrained k-means algorithm functions correctly as a weak learner of boosting. |
format | Online Article Text |
id | pubmed-7805652 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-78056522021-01-25 Clustering Using Boosted Constrained k-Means Algorithm Okabe, Masayuki Yamada, Seiji Front Robot AI Robotics and AI This article proposes a constrained clustering algorithm with competitive performance and less computation time to the state-of-the-art methods, which consists of a constrained k-means algorithm enhanced by the boosting principle. Constrained k-means clustering using constraints as background knowledge, although easy to implement and quick, has insufficient performance compared with metric learning-based methods. Since it simply adds a function into the data assignment process of the k-means algorithm to check for constraint violations, it often exploits only a small number of constraints. Metric learning-based methods, which exploit constraints to create a new metric for data similarity, have shown promising results although the methods proposed so far are often slow depending on the amount of data or number of feature dimensions. We present a method that exploits the advantages of the constrained k-means and metric learning approaches. It incorporates a mechanism for accepting constraint priorities and a metric learning framework based on the boosting principle into a constrained k-means algorithm. In the framework, a metric is learned in the form of a kernel matrix that integrates weak cluster hypotheses produced by the constrained k-means algorithm, which works as a weak learner under the boosting principle. Experimental results for 12 data sets from 3 data sources demonstrated that our method has performance competitive to those of state-of-the-art constrained clustering methods for most data sets and that it takes much less computation time. Experimental evaluation demonstrated the effectiveness of controlling the constraint priorities by using the boosting principle and that our constrained k-means algorithm functions correctly as a weak learner of boosting. Frontiers Media S.A. 2018-03-08 /pmc/articles/PMC7805652/ /pubmed/33500905 http://dx.doi.org/10.3389/frobt.2018.00018 Text en Copyright © 2018 Okabe and Yamada. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Robotics and AI Okabe, Masayuki Yamada, Seiji Clustering Using Boosted Constrained k-Means Algorithm |
title | Clustering Using Boosted Constrained k-Means Algorithm |
title_full | Clustering Using Boosted Constrained k-Means Algorithm |
title_fullStr | Clustering Using Boosted Constrained k-Means Algorithm |
title_full_unstemmed | Clustering Using Boosted Constrained k-Means Algorithm |
title_short | Clustering Using Boosted Constrained k-Means Algorithm |
title_sort | clustering using boosted constrained k-means algorithm |
topic | Robotics and AI |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805652/ https://www.ncbi.nlm.nih.gov/pubmed/33500905 http://dx.doi.org/10.3389/frobt.2018.00018 |
work_keys_str_mv | AT okabemasayuki clusteringusingboostedconstrainedkmeansalgorithm AT yamadaseiji clusteringusingboostedconstrainedkmeansalgorithm |