Convex Clustering: An Attractive Alternative to Hierarchical Clustering
The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominan...
Main Authors: | Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4429070/ https://www.ncbi.nlm.nih.gov/pubmed/25965340 http://dx.doi.org/10.1371/journal.pcbi.1004228 |
_version_ | 1782370975596150784 |
---|---|
author | Chen, Gary K. Chi, Eric C. Ranola, John Michael O. Lange, Kenneth |
author_facet | Chen, Gary K. Chi, Eric C. Ranola, John Michael O. Lange, Kenneth |
author_sort | Chen, Gary K. |
collection | PubMed |
description | The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ |
format | Online Article Text |
id | pubmed-4429070 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4429070 2015-05-21 Convex Clustering: An Attractive Alternative to Hierarchical Clustering Chen, Gary K. Chi, Eric C. Ranola, John Michael O. Lange, Kenneth PLoS Comput Biol Research Article The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ Public Library of Science 2015-05-12 /pmc/articles/PMC4429070/ /pubmed/25965340 http://dx.doi.org/10.1371/journal.pcbi.1004228 Text en © 2015 Chen et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Chen, Gary K. Chi, Eric C. Ranola, John Michael O. Lange, Kenneth Convex Clustering: An Attractive Alternative to Hierarchical Clustering |
title | Convex Clustering: An Attractive Alternative to Hierarchical Clustering |
title_full | Convex Clustering: An Attractive Alternative to Hierarchical Clustering |
title_fullStr | Convex Clustering: An Attractive Alternative to Hierarchical Clustering |
title_full_unstemmed | Convex Clustering: An Attractive Alternative to Hierarchical Clustering |
title_short | Convex Clustering: An Attractive Alternative to Hierarchical Clustering |
title_sort | convex clustering: an attractive alternative to hierarchical clustering |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4429070/ https://www.ncbi.nlm.nih.gov/pubmed/25965340 http://dx.doi.org/10.1371/journal.pcbi.1004228 |
work_keys_str_mv | AT chengaryk convexclusteringanattractivealternativetohierarchicalclustering AT chiericc convexclusteringanattractivealternativetohierarchicalclustering AT ranolajohnmichaelo convexclusteringanattractivealternativetohierarchicalclustering AT langekenneth convexclusteringanattractivealternativetohierarchicalclustering |