A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics
The central purpose of this study is to further evaluate the quality of the performance of a new algorithm. The study provides additional evidence on this algorithm that was designed to increase the overall efficiency of the original k-means clustering technique—the Fast, Efficient, and Scalable k-m...
Main author: | Oyana, Tonny J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3171363/ https://www.ncbi.nlm.nih.gov/pubmed/20689710 http://dx.doi.org/10.1155/2010/746021 |
_version_ | 1782211745731837952 |
---|---|
author | Oyana, Tonny J |
author_facet | Oyana, Tonny J |
author_sort | Oyana, Tonny J |
collection | PubMed |
description | The central purpose of this study is to further evaluate the quality of the performance of a new algorithm. The study provides additional evidence on this algorithm that was designed to increase the overall efficiency of the original k-means clustering technique—the Fast, Efficient, and Scalable k-means algorithm (FES-k-means). The FES-k-means algorithm uses a hybrid approach that comprises the k-d tree data structure that enhances the nearest neighbor query, the original k-means algorithm, and an adaptation rate proposed by Mashor. This algorithm was tested using two real datasets and one synthetic dataset. It was employed twice on all three datasets: once on data trained by the innovative MIL-SOM method and then on the actual untrained data in order to evaluate its competence. This two-step approach of data training prior to clustering provides a solid foundation for knowledge discovery and data mining, otherwise unclaimed by clustering methods alone. The benefits of this method are that it produces clusters similar to the original k-means method at a much faster rate as shown by runtime comparison data; and it provides efficient analysis of large geospatial data with implications for disease mechanism discovery. From a disease mechanism discovery perspective, it is hypothesized that the linear-like pattern of elevated blood lead levels discovered in the city of Chicago may be spatially linked to the city's water service lines. |
format | Online Article Text |
id | pubmed-3171363 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Springer |
record_format | MEDLINE/PubMed |
spelling | pubmed-31713632011-09-13 A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics Oyana, Tonny J EURASIP J Bioinform Syst Biol Research Article The central purpose of this study is to further evaluate the quality of the performance of a new algorithm. The study provides additional evidence on this algorithm that was designed to increase the overall efficiency of the original k-means clustering technique—the Fast, Efficient, and Scalable k-means algorithm (FES-k-means). The FES-k-means algorithm uses a hybrid approach that comprises the k-d tree data structure that enhances the nearest neighbor query, the original k-means algorithm, and an adaptation rate proposed by Mashor. This algorithm was tested using two real datasets and one synthetic dataset. It was employed twice on all three datasets: once on data trained by the innovative MIL-SOM method and then on the actual untrained data in order to evaluate its competence. This two-step approach of data training prior to clustering provides a solid foundation for knowledge discovery and data mining, otherwise unclaimed by clustering methods alone. The benefits of this method are that it produces clusters similar to the original k-means method at a much faster rate as shown by runtime comparison data; and it provides efficient analysis of large geospatial data with implications for disease mechanism discovery. From a disease mechanism discovery perspective, it is hypothesized that the linear-like pattern of elevated blood lead levels discovered in the city of Chicago may be spatially linked to the city's water service lines. Springer 2010-06-08 /pmc/articles/PMC3171363/ /pubmed/20689710 http://dx.doi.org/10.1155/2010/746021 Text en Copyright © 2010 Tonny J. Oyana. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Oyana, Tonny J A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics |
title | A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics |
title_full | A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics |
title_fullStr | A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics |
title_full_unstemmed | A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics |
title_short | A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics |
title_sort | new-fangled fes-k-means clustering algorithm for disease discovery and visual analytics |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3171363/ https://www.ncbi.nlm.nih.gov/pubmed/20689710 http://dx.doi.org/10.1155/2010/746021 |
work_keys_str_mv | AT oyanatonnyj anewfangledfeskmeansclusteringalgorithmfordiseasediscoveryandvisualanalytics AT oyanatonnyj newfangledfeskmeansclusteringalgorithmfordiseasediscoveryandvisualanalytics |
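
The description in this record says that FES-k-means combines a k-d tree, used to speed up the nearest neighbor query, with the original k-means algorithm. The following is a minimal Python sketch of that general idea only, not the author's implementation. It assumes the tree is rebuilt over the current centroids on each iteration (the article may index the data differently), and it leaves out both Mashor's adaptation rate and the MIL-SOM training step. All function names are hypothetical.

```python
def build_kdtree(items, depth=0):
    """Build a k-d tree from (index, point) pairs; returns nested tuples or None."""
    if not items:
        return None
    axis = depth % len(items[0][1])            # cycle through the coordinates
    items = sorted(items, key=lambda ip: ip[1][axis])
    mid = len(items) // 2                      # median item becomes the node
    return (items[mid], axis,
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return (squared_distance, index) of the stored point closest to target."""
    if node is None:
        return best
    (idx, point), axis, left, right = node
    d = sq_dist(point, target)
    if best is None or d < best[0]:
        best = (d, idx)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, best)
    # Search the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(far, target, best)
    return best


def kmeans_kdtree(data, initial_centroids, max_iter=50):
    """Plain k-means whose assignment step queries a k-d tree of centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        tree = build_kdtree(list(enumerate(centroids)))
        new_labels = [nearest(tree, p)[1] for p in data]
        if new_labels == labels:               # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centroids)):        # update step: mean of members
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, labels


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
    print(kmeans_kdtree(points, [(0.0, 0.0), (5.0, 5.0)]))
```

With k centroids, a brute-force assignment compares every point against all k centroids, whereas the tree query can skip subtrees whose splitting plane is already farther away than the best match found so far. The benefit in this sketch grows with k. It says nothing about the runtime figures reported in the article.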