MCLEAN: Multilevel Clustering Exploration As Network

Finding useful patterns in datasets has attracted considerable interest in the field of visual analytics. One of the most common tasks is the identification and representation of clusters. However, this is non-trivial in heterogeneous datasets since the data needs to be analyzed from different persp...

Bibliographic Details
Main authors: Alcaide, Daniel; Aerts, Jan
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects: Data Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924466/
https://www.ncbi.nlm.nih.gov/pubmed/33816801
http://dx.doi.org/10.7717/peerj-cs.145
author Alcaide, Daniel
Aerts, Jan
collection PubMed
description Finding useful patterns in datasets has attracted considerable interest in the field of visual analytics. One of the most common tasks is the identification and representation of clusters. However, this is non-trivial in heterogeneous datasets since the data needs to be analyzed from different perspectives. Indeed, highly variable patterns may mask underlying trends in the dataset. Dendrograms are graphical representations resulting from agglomerative hierarchical clustering and provide a framework for viewing the clustering at different levels of detail. However, dendrograms become cluttered when the dataset gets large, and the single cut of the dendrogram to demarcate different clusters can be insufficient in heterogeneous datasets. In this work, we propose a visual analytics methodology called MCLEAN that offers a general approach for guiding the user through the exploration and detection of clusters. Powered by a graph-based transformation of the relational data, it supports a scalable environment for representation of heterogeneous datasets by changing the spatialization. We thereby combine multilevel representations of the clustered dataset with community finding algorithms. Our approach entails displaying the results of the heuristics to users, providing a setting from which to start the exploration and data analysis. To evaluate our proposed approach, we conduct a qualitative user study, where participants are asked to explore a heterogeneous dataset, comparing the results obtained by MCLEAN with the dendrogram. These qualitative results reveal that MCLEAN is an effective way of aiding users in the detection of clusters in heterogeneous datasets. The proposed methodology is implemented in an R package available at https://bitbucket.org/vda-lab/mclean.
format Online
Article
Text
id pubmed-7924466
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7924466 2021-04-02 MCLEAN: Multilevel Clustering Exploration As Network Alcaide, Daniel Aerts, Jan PeerJ Comput Sci Data Science PeerJ Inc. 2018-01-29 /pmc/articles/PMC7924466/ /pubmed/33816801 http://dx.doi.org/10.7717/peerj-cs.145 Text en ©2018 Alcaide and Aerts http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title MCLEAN: Multilevel Clustering Exploration As Network
topic Data Science