Cargando…

DHPV: a distributed algorithm for large-scale graph partitioning

Big graphs are part of the movement of “Not Only SQL” databases (also called NoSQL) focusing on the relationships between data, rather than the values themselves. The data is stored in vertices while the edges model the interactions or relationships between these data. They offer flexibility in hand...

Descripción completa

Detalles Bibliográficos
Autores principales:	Adoni, Wilfried Yves Hamilton, Nahhal, Tarik, Krichen, Moez, El byed, Abdeltif, Assayad, Ismail
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	Springer International Publishing 2020
Materias:	Methodology
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7493068/ https://www.ncbi.nlm.nih.gov/pubmed/32953386 http://dx.doi.org/10.1186/s40537-020-00357-y

_version_	1783582492131852288
author	Adoni, Wilfried Yves Hamilton Nahhal, Tarik Krichen, Moez El byed, Abdeltif Assayad, Ismail
author_facet	Adoni, Wilfried Yves Hamilton Nahhal, Tarik Krichen, Moez El byed, Abdeltif Assayad, Ismail
author_sort	Adoni, Wilfried Yves Hamilton
collection	PubMed
description	Big graphs are part of the movement of “Not Only SQL” databases (also called NoSQL) focusing on the relationships between data, rather than the values themselves. The data is stored in vertices while the edges model the interactions or relationships between these data. They offer flexibility in handling data that is strongly connected to each other. The analysis of a big graph generally involves exploring all of its vertices. Thus, this operation is costly in time and resources because big graphs are generally composed of millions of vertices connected through billions of edges. Consequently, the graph algorithms are expansive compared to the size of the big graph, and are therefore ineffective for data exploration. Thus, partitioning the graph stands out as an efficient and less expensive alternative for exploring a big graph. This technique consists in partitioning the graph into a set of k sub-graphs in order to reduce the complexity of the queries. Nevertheless, it presents many challenges because it is an NP-complete problem. In this article, we present DPHV (Distributed Placement of Hub-Vertices) an efficient parallel and distributed heuristic for large-scale graph partitioning. An application on a real-world graphs demonstrates the feasibility and reliability of our method. The experiments carried on a 10-nodes Spark cluster proved that the proposed methodology achieves significant gain in term of time and outperforms JA-BE-JA, Greedy, DFEP.
format	Online Article Text
id	pubmed-7493068
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Springer International Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-74930682020-09-16 DHPV: a distributed algorithm for large-scale graph partitioning Adoni, Wilfried Yves Hamilton Nahhal, Tarik Krichen, Moez El byed, Abdeltif Assayad, Ismail J Big Data Methodology Big graphs are part of the movement of “Not Only SQL” databases (also called NoSQL) focusing on the relationships between data, rather than the values themselves. The data is stored in vertices while the edges model the interactions or relationships between these data. They offer flexibility in handling data that is strongly connected to each other. The analysis of a big graph generally involves exploring all of its vertices. Thus, this operation is costly in time and resources because big graphs are generally composed of millions of vertices connected through billions of edges. Consequently, the graph algorithms are expansive compared to the size of the big graph, and are therefore ineffective for data exploration. Thus, partitioning the graph stands out as an efficient and less expensive alternative for exploring a big graph. This technique consists in partitioning the graph into a set of k sub-graphs in order to reduce the complexity of the queries. Nevertheless, it presents many challenges because it is an NP-complete problem. In this article, we present DPHV (Distributed Placement of Hub-Vertices) an efficient parallel and distributed heuristic for large-scale graph partitioning. An application on a real-world graphs demonstrates the feasibility and reliability of our method. The experiments carried on a 10-nodes Spark cluster proved that the proposed methodology achieves significant gain in term of time and outperforms JA-BE-JA, Greedy, DFEP. Springer International Publishing 2020-09-16 2020 /pmc/articles/PMC7493068/ /pubmed/32953386 http://dx.doi.org/10.1186/s40537-020-00357-y Text en © The Author(s) 2020 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle	Methodology Adoni, Wilfried Yves Hamilton Nahhal, Tarik Krichen, Moez El byed, Abdeltif Assayad, Ismail DHPV: a distributed algorithm for large-scale graph partitioning
title	DHPV: a distributed algorithm for large-scale graph partitioning
title_full	DHPV: a distributed algorithm for large-scale graph partitioning
title_fullStr	DHPV: a distributed algorithm for large-scale graph partitioning
title_full_unstemmed	DHPV: a distributed algorithm for large-scale graph partitioning
title_short	DHPV: a distributed algorithm for large-scale graph partitioning
title_sort	dhpv: a distributed algorithm for large-scale graph partitioning
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7493068/ https://www.ncbi.nlm.nih.gov/pubmed/32953386 http://dx.doi.org/10.1186/s40537-020-00357-y
work_keys_str_mv	AT adoniwilfriedyveshamilton dhpvadistributedalgorithmforlargescalegraphpartitioning AT nahhaltarik dhpvadistributedalgorithmforlargescalegraphpartitioning AT krichenmoez dhpvadistributedalgorithmforlargescalegraphpartitioning AT elbyedabdeltif dhpvadistributedalgorithmforlargescalegraphpartitioning AT assayadismail dhpvadistributedalgorithmforlargescalegraphpartitioning

DHPV: a distributed algorithm for large-scale graph partitioning

Ejemplares similares