Cargando…

Hierarchical Clustering Using the Arithmetic-Harmonic Cut: Complexity and Experiments

Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering pr...

Descripción completa

Detalles Bibliográficos
Autores principales: Rizzi, Romeo, Mahata, Pritha, Mathieson, Luke, Moscato, Pablo
Formato: Texto
Lenguaje:English
Publicado: Public Library of Science 2010
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2997101/
https://www.ncbi.nlm.nih.gov/pubmed/21151943
http://dx.doi.org/10.1371/journal.pone.0014067
_version_ 1782193272059330560
author Rizzi, Romeo
Mahata, Pritha
Mathieson, Luke
Moscato, Pablo
author_facet Rizzi, Romeo
Mahata, Pritha
Mathieson, Luke
Moscato, Pablo
author_sort Rizzi, Romeo
collection PubMed
description Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is [Image: see text]-hard and [Image: see text]-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as [Image: see text]-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcoming difficulties other hierarchal methods have in representing both intercluster differences and intracluster similarities.
format Text
id pubmed-2997101
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-29971012010-12-10 Hierarchical Clustering Using the Arithmetic-Harmonic Cut: Complexity and Experiments Rizzi, Romeo Mahata, Pritha Mathieson, Luke Moscato, Pablo PLoS One Research Article Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is [Image: see text]-hard and [Image: see text]-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as [Image: see text]-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcoming difficulties other hierarchal methods have in representing both intercluster differences and intracluster similarities. Public Library of Science 2010-12-02 /pmc/articles/PMC2997101/ /pubmed/21151943 http://dx.doi.org/10.1371/journal.pone.0014067 Text en Rizzi et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Rizzi, Romeo
Mahata, Pritha
Mathieson, Luke
Moscato, Pablo
Hierarchical Clustering Using the Arithmetic-Harmonic Cut: Complexity and Experiments
title Hierarchical Clustering Using the Arithmetic-Harmonic Cut: Complexity and Experiments
title_full Hierarchical Clustering Using the Arithmetic-Harmonic Cut: Complexity and Experiments
title_fullStr Hierarchical Clustering Using the Arithmetic-Harmonic Cut: Complexity and Experiments
title_full_unstemmed Hierarchical Clustering Using the Arithmetic-Harmonic Cut: Complexity and Experiments
title_short Hierarchical Clustering Using the Arithmetic-Harmonic Cut: Complexity and Experiments
title_sort hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2997101/
https://www.ncbi.nlm.nih.gov/pubmed/21151943
http://dx.doi.org/10.1371/journal.pone.0014067
work_keys_str_mv AT rizziromeo hierarchicalclusteringusingthearithmeticharmoniccutcomplexityandexperiments
AT mahatapritha hierarchicalclusteringusingthearithmeticharmoniccutcomplexityandexperiments
AT mathiesonluke hierarchicalclusteringusingthearithmeticharmoniccutcomplexityandexperiments
AT moscatopablo hierarchicalclusteringusingthearithmeticharmoniccutcomplexityandexperiments