
A Multicriteria Decision Making Approach for Estimating the Number of Clusters in a Data Set

Determining the number of clusters in a data set is an essential yet difficult step in cluster analysis. Since this task involves more than one criterion, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper proposes a multiple criteria decision making (MCDM)-based app...

Bibliographic Details
Main Authors: Peng, Yi, Zhang, Yong, Kou, Gang, Shi, Yong
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3411440/
https://www.ncbi.nlm.nih.gov/pubmed/22870181
http://dx.doi.org/10.1371/journal.pone.0041713
author Peng, Yi
Zhang, Yong
Kou, Gang
Shi, Yong
collection PubMed
description Determining the number of clusters in a data set is an essential yet difficult step in cluster analysis. Since this task involves more than one criterion, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper proposes an MCDM-based approach to estimate the number of clusters for a given data set. In this approach, MCDM methods treat the candidate numbers of clusters as alternatives and the scores that a clustering algorithm's output obtains on validity measures as criteria. The proposed method is examined in an experimental study using three MCDM methods, the well-known k-means clustering algorithm, ten relative measures, and fifteen public-domain UCI machine learning data sets. The results show that MCDM methods work fairly well in estimating the number of clusters in the data and outperform the ten relative measures considered in the study.
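The idea in the description can be illustrated with a minimal sketch: each candidate number of clusters is a row (an alternative), each validity measure is a column (a criterion), and an MCDM method ranks the rows. The record does not name the three MCDM methods or the ten relative measures, so TOPSIS is used here purely as an assumed stand-in, the function name `topsis` is hypothetical, and the numbers in the decision matrix are made-up illustrative values, not results from the paper.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) by relative closeness to the ideal solution.

    matrix  -- one row per alternative, one column per criterion
    weights -- one weight per criterion
    benefit -- True if higher is better for that criterion, False if lower is better
    """
    n_cols = len(matrix[0])
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
                for row in matrix]
    # Ideal and anti-ideal points, respecting each criterion's direction.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_ideal = math.dist(row, ideal)
        d_worst = math.dist(row, worst)
        total = d_ideal + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores


# Alternatives: candidate numbers of clusters (assumed range for illustration).
candidate_k = [2, 3, 4, 5]

# Criteria: three hypothetical validity measures computed on the clustering
# output for each k. Values are invented for this example only.
# Column 0: higher is better, column 1: lower is better, column 2: higher is better.
decision_matrix = [
    [0.50, 0.90, 300.0],  # k = 2
    [0.70, 0.50, 450.0],  # k = 3
    [0.55, 0.80, 400.0],  # k = 4
    [0.40, 1.10, 350.0],  # k = 5
]
weights = [1 / 3, 1 / 3, 1 / 3]   # equal weights, an assumption
benefit = [True, False, True]

scores = topsis(decision_matrix, weights, benefit)
best_k = candidate_k[scores.index(max(scores))]
print(best_k, [round(s, 3) for s in scores])
```

In practice the matrix would be filled by running the clustering algorithm once per candidate k and evaluating each validity measure on the result; the weighting scheme and the choice of MCDM method are design decisions that this sketch does not take from the paper.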
format Online
Article
Text
id pubmed-3411440
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-34114402012-08-06 A Multicriteria Decision Making Approach for Estimating the Number of Clusters in a Data Set Peng, Yi Zhang, Yong Kou, Gang Shi, Yong PLoS One Research Article Determining the number of clusters in a data set is an essential yet difficult step in cluster analysis. Since this task involves more than one criterion, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper proposes a multiple criteria decision making (MCDM)-based approach to estimate the number of clusters for a given data set. In this approach, MCDM methods consider different numbers of clusters as alternatives and the outputs of any clustering algorithm on validity measures as criteria. The proposed method is examined by an experimental study using three MCDM methods, the well-known clustering algorithm–k-means, ten relative measures, and fifteen public-domain UCI machine learning data sets. The results show that MCDM methods work fairly well in estimating the number of clusters in the data and outperform the ten relative measures considered in the study. Public Library of Science 2012-07-27 /pmc/articles/PMC3411440/ /pubmed/22870181 http://dx.doi.org/10.1371/journal.pone.0041713 Text en © 2012 Peng et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Multicriteria Decision Making Approach for Estimating the Number of Clusters in a Data Set
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3411440/
https://www.ncbi.nlm.nih.gov/pubmed/22870181
http://dx.doi.org/10.1371/journal.pone.0041713