Cargando…

Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters

This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multi-levels of parallelism (between nodes of a cluster, between di...

Descripción completa

Detalles Bibliográficos
Autores principales: Dashti, Ali, Komarov, Ivan, D’Souza, Roshan M.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2013
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3781071/
https://www.ncbi.nlm.nih.gov/pubmed/24086314
http://dx.doi.org/10.1371/journal.pone.0074113
_version_ 1782285361903304704
author Dashti, Ali
Komarov, Ivan
D’Souza, Roshan M.
author_facet Dashti, Ali
Komarov, Ivan
D’Souza, Roshan M.
author_sort Dashti, Ali
collection PubMed
description This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible [Image: see text]-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.
format Online
Article
Text
id pubmed-3781071
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-37810712013-10-01 Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters Dashti, Ali Komarov, Ivan D’Souza, Roshan M. PLoS One Research Article This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible [Image: see text]-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility. Public Library of Science 2013-09-23 /pmc/articles/PMC3781071/ /pubmed/24086314 http://dx.doi.org/10.1371/journal.pone.0074113 Text en © 2013 Dashti et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Dashti, Ali
Komarov, Ivan
D’Souza, Roshan M.
Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
title Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
title_full Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
title_fullStr Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
title_full_unstemmed Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
title_short Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
title_sort efficient computation of k-nearest neighbour graphs for large high-dimensional data sets on gpu clusters
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3781071/
https://www.ncbi.nlm.nih.gov/pubmed/24086314
http://dx.doi.org/10.1371/journal.pone.0074113
work_keys_str_mv AT dashtiali efficientcomputationofknearestneighbourgraphsforlargehighdimensionaldatasetsongpuclusters
AT komarovivan efficientcomputationofknearestneighbourgraphsforlargehighdimensionaldatasetsongpuclusters
AT dsouzaroshanm efficientcomputationofknearestneighbourgraphsforlargehighdimensionaldatasetsongpuclusters