
Canonical PSO Based K-Means Clustering Approach for Real Datasets

Clustering is a technique whose significance and applications span many fields. It is an unsupervised process in data mining, which is why properly evaluating the results and measuring the compactness and separability of the clusters are important issues. The procedure of...


Bibliographic Details
Main Authors:	Dey, Lopamudra, Chakraborty, Sanjay
Format:	Online Article Text
Language:	English
Published:	Hindawi Publishing Corporation 2014
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4897525/ https://www.ncbi.nlm.nih.gov/pubmed/27355083 http://dx.doi.org/10.1155/2014/414013

author	Dey, Lopamudra Chakraborty, Sanjay
author_facet	Dey, Lopamudra Chakraborty, Sanjay
author_sort	Dey, Lopamudra
collection	PubMed
description	Clustering is a technique whose significance and applications span many fields. It is an unsupervised process in data mining, which is why properly evaluating the results and measuring the compactness and separability of the clusters are important issues. The procedure of evaluating the results of a clustering algorithm is known as a cluster validity measure. Different indices suit different types of problems, and index selection depends on the kind of data available. This paper first proposes a Canonical PSO based K-means clustering algorithm, analyses some important clustering indices (intercluster and intracluster), and then evaluates the effects of those indices on a real-time air pollution database and on the wholesale customer, wine, and vehicle datasets using typical K-means, Canonical PSO based K-means, simple PSO based K-means, DBSCAN, and hierarchical clustering algorithms. The paper also describes the nature of the clusters and compares the performance of these clustering algorithms according to the validity assessment. It identifies which of these algorithms is most desirable for producing compact clusters on these particular real-life datasets. It examines the behaviour of these clustering algorithms with respect to the validation indices and presents the evaluation results in mathematical and graphical form.
format	Online Article Text
id	pubmed-4897525
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Hindawi Publishing Corporation
record_format	MEDLINE/PubMed
spelling	pubmed-4897525 2016-06-28 Canonical PSO Based K-Means Clustering Approach for Real Datasets Dey, Lopamudra Chakraborty, Sanjay Int Sch Res Notices Research Article Hindawi Publishing Corporation 2014-11-12 /pmc/articles/PMC4897525/ /pubmed/27355083 http://dx.doi.org/10.1155/2014/414013 Text en Copyright © 2014 L. Dey and S. Chakraborty. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Dey, Lopamudra Chakraborty, Sanjay Canonical PSO Based K-Means Clustering Approach for Real Datasets
title	Canonical PSO Based K-Means Clustering Approach for Real Datasets
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4897525/ https://www.ncbi.nlm.nih.gov/pubmed/27355083 http://dx.doi.org/10.1155/2014/414013
