
Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data

Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for an...


Bibliographic Details
Main authors: Shu, Guoping, Zeng, Beiyan, Chen, Yiping P., Smith, Oscar H.
Format: Text
Language: English
Published: Hindawi Publishing Corporation 2003
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2448457/
https://www.ncbi.nlm.nih.gov/pubmed/18629292
http://dx.doi.org/10.1002/cfg.290
_version_ 1782157140371177472
author Shu, Guoping
Zeng, Beiyan
Chen, Yiping P.
Smith, Oscar H.
author_facet Shu, Guoping
Zeng, Beiyan
Chen, Yiping P.
Smith, Oscar H.
author_sort Shu, Guoping
collection PubMed
description Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and within-cluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r² test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise when clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust clustering method for analysing noisy expression profile data compared to the other three methods assessed.
format Text
id pubmed-2448457
institution National Center for Biotechnology Information
language English
publishDate 2003
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-24484572008-07-14 Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data Shu, Guoping Zeng, Beiyan Chen, Yiping P. Smith, Oscar H. Comp Funct Genomics Research Article Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and within-cluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r² test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise when clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust clustering method for analysing noisy expression profile data compared to the other three methods assessed. Hindawi Publishing Corporation 2003-06 /pmc/articles/PMC2448457/ /pubmed/18629292 http://dx.doi.org/10.1002/cfg.290 Text en Copyright © 2003 Hindawi Publishing Corporation. http://creativecommons.org/licenses/by/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Shu, Guoping
Zeng, Beiyan
Chen, Yiping P.
Smith, Oscar H.
Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
title Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
title_full Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
title_fullStr Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
title_full_unstemmed Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
title_short Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
title_sort performance assessment of kernel density clustering for gene expression profile data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2448457/
https://www.ncbi.nlm.nih.gov/pubmed/18629292
http://dx.doi.org/10.1002/cfg.290
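
The description in this record names the Adjusted Rand Index as one of the measures used to score agreement between a clustering result and a reference partition. The record does not show how the authors computed it, so the following is only a minimal sketch of the standard Hubert-Arabie formula in plain Python. The function name `adjusted_rand_index`, its parameters, and the convention of returning 1.0 for degenerate inputs are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items.

    Illustrative sketch only; names and edge-case conventions are assumptions.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two items: no pairs to compare (assumed convention).
        return 1.0

    # Contingency counts: how many items fall in each (true, predicted) cell.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of item pairs placed together in each cell, row, and column.
    pairs_cells = sum(comb(c, 2) for c in cells.values())
    pairs_rows = sum(comb(c, 2) for c in rows.values())
    pairs_cols = sum(comb(c, 2) for c in cols.values())

    # Expected agreement under random labeling with fixed cluster sizes.
    expected = pairs_rows * pairs_cols / total_pairs
    maximum = (pairs_rows + pairs_cols) / 2

    if maximum == expected:
        # Both partitions are trivial (all singletons or a single cluster).
        return 1.0
    return (pairs_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    # Hypothetical labels for four expression profiles.
    reference = [0, 0, 1, 1]
    relabeled = [1, 1, 0, 0]
    print(adjusted_rand_index(reference, relabeled))
```

The index depends only on which items are grouped together, not on the label values themselves, so a relabeled copy of the same partition scores as perfect agreement. Values near 0 indicate agreement no better than chance.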