
GPU-Embedding of kNN-Graph Representing Large and High-Dimensional Data

Interactive visual exploration of large and multidimensional data still needs more efficient [Formula: see text] data embedding (DE) algorithms. We claim that the visualization of very high-dimensional data is equivalent to the problem of 2D embedding of undirected kNN-graphs. We demonstrate that hi...


Bibliographic Details
Main authors: Minch, Bartosz; Nowak, Mateusz; Wcisło, Rafał; Dzwinel, Witold
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302810/
http://dx.doi.org/10.1007/978-3-030-50417-5_24
author Minch, Bartosz
Nowak, Mateusz
Wcisło, Rafał
Dzwinel, Witold
collection PubMed
description Interactive visual exploration of large and multidimensional data still needs more efficient [Formula: see text] data embedding (DE) algorithms. We claim that the visualization of very high-dimensional data is equivalent to the problem of 2D embedding of undirected kNN-graphs. We demonstrate that high-quality embeddings can be produced with minimal time and memory complexity. A very efficient GPU version of the IVHD (interactive visualization of high-dimensional data) algorithm is presented, and we compare it to the state-of-the-art GPU-implemented DE methods: BH-SNE-CUDA and AtSNE-CUDA. We show that memory and time requirements for IVHD-CUDA are radically lower than those for the baseline codes. For example, IVHD-CUDA is almost 30 times faster than AtSNE-CUDA in embedding the largest ([Formula: see text]) YAHOO dataset (excluding kNN graph generation, which is the same for all the methods). We conclude that, at the expense of a minor deterioration in embedding quality compared to the baseline algorithms, IVHD preserves the main structural properties of ND data in 2D well for a radically lower computational budget. Thus, our method can be a good candidate for truly big data ([Formula: see text]) interactive visualization.
format Online
Article
Text
id pubmed-7302810
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7302810 2020-06-19 GPU-Embedding of kNN-Graph Representing Large and High-Dimensional Data Minch, Bartosz Nowak, Mateusz Wcisło, Rafał Dzwinel, Witold Computational Science – ICCS 2020 Article 2020-06-15 /pmc/articles/PMC7302810/ http://dx.doi.org/10.1007/978-3-030-50417-5_24 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title GPU-Embedding of kNN-Graph Representing Large and High-Dimensional Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302810/
http://dx.doi.org/10.1007/978-3-030-50417-5_24