A practical comparison of two K-Means clustering algorithms

Bibliographic Details
Main authors:	Wilkin, Gregory A; Huang, Xiuzhen
Format:	Text
Language:	English
Published:	BioMed Central 2008
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2423442/ https://www.ncbi.nlm.nih.gov/pubmed/18541054 http://dx.doi.org/10.1186/1471-2105-9-S6-S19

collection	PubMed
description	BACKGROUND: Data clustering is a powerful technique for identifying data with similar characteristics, such as genes with similar expression patterns. However, not all implementations of clustering algorithms yield the same performance or the same clusters. RESULTS: In this paper, we study two implementations of a general method for data clustering: k-means clustering. Our experiments compare the running times and distance efficiency of Lloyd's K-means Clustering and the Progressive Greedy K-means Clustering. CONCLUSION: Based on our implementation, Lloyd's K-means Clustering algorithm is more efficient, not only in processing time but also in terms of mean squared-difference (MSD). This analysis was performed on both a gene expression level sample and randomly generated datasets in three-dimensional space. However, other circumstances may dictate a different choice.
id	pubmed-2423442
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
Journal:	BMC Bioinformatics
Publication date:	2008-05-28
License:	Copyright © 2008 Wilkin and Huang; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.