
Clustering performance comparison using K-means and expectation maximization algorithms

Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (...


Bibliographic Details
Main Authors:	Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
Format:	Online Article Text
Language:	English
Published:	Taylor & Francis 2014
Subjects:	Article; Bioinformatics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4433949/ https://www.ncbi.nlm.nih.gov/pubmed/26019610 http://dx.doi.org/10.1080/13102818.2014.949045

author	Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
author_sort	Jung, Yong Gyu
collection	PubMed
description	Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
format	Online Article Text
id	pubmed-4433949
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Taylor & Francis
record_format	MEDLINE/PubMed
spelling	pubmed-4433949 2015-05-25 Clustering performance comparison using K-means and expectation maximization algorithms Jung, Yong Gyu; Kang, Min Soo; Heo, Jun Biotechnol Biotechnol Equip Article; Bioinformatics Taylor & Francis 2014-11-14 2014-11-06 /pmc/articles/PMC4433949/ /pubmed/26019610 http://dx.doi.org/10.1080/13102818.2014.949045 Text en © 2014 The Author(s). Published by Taylor & Francis. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0/, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The moral rights of the named author(s) have been asserted.
title	Clustering performance comparison using K-means and expectation maximization algorithms
topic	Article; Bioinformatics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4433949/ https://www.ncbi.nlm.nih.gov/pubmed/26019610 http://dx.doi.org/10.1080/13102818.2014.949045

