
Proximity-Based Compression for Network Embedding

Network embedding that encodes structural information of graphs into a low-dimensional vector space has been proven to be essential for network analysis applications, including node classification and community detection. Although recent methods show promising performance for various applications, g...


Bibliographic Details
Main Authors: Islam, Muhammad Ifte; Tanvir, Farhan; Johnson, Ginger; Akbas, Esra; Aktas, Mehmet Emin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931879/
https://www.ncbi.nlm.nih.gov/pubmed/33693427
http://dx.doi.org/10.3389/fdata.2020.608043
author Islam, Muhammad Ifte
Tanvir, Farhan
Johnson, Ginger
Akbas, Esra
Aktas, Mehmet Emin
collection PubMed
description Network embedding, which encodes the structural information of graphs into a low-dimensional vector space, has proven essential for network analysis applications, including node classification and community detection. Although recent methods show promising performance in various applications, graph embedding still faces challenges: either the sheer size of a graph hinders direct application of existing network embedding methods, or those methods lose accuracy because of locality and noise. In this paper, we propose a novel Network Embedding method, NECL, to generate embeddings more efficiently or effectively. Our goal is to answer the following two questions: 1) Does network Compression significantly boost Learning? 2) Does network compression improve the quality of the representation? To this end, first, we propose a novel graph compression method based on neighborhood similarity that compresses the input graph into a smaller graph by incorporating the local proximity of its vertices into super-nodes; second, we use the compressed graph for network embedding instead of the original large graph, which reduces the embedding cost and also captures the global structure of the original graph; third, we refine the embeddings from the compressed graph to the original graph. NECL is a general meta-strategy that improves the efficiency and effectiveness of many state-of-the-art graph embedding algorithms based on node proximity, including DeepWalk, Node2vec, and LINE. Extensive experiments validate the efficiency and effectiveness of our method, which decreases embedding time and improves classification accuracy as evaluated on single-label and multi-label classification tasks with large real-world graphs.
format Online
Article
Text
id pubmed-7931879
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7931879 2021-03-09 Proximity-Based Compression for Network Embedding Islam, Muhammad Ifte; Tanvir, Farhan; Johnson, Ginger; Akbas, Esra; Aktas, Mehmet Emin Front Big Data Big Data Frontiers Media S.A.
2021-01-26 /pmc/articles/PMC7931879/ /pubmed/33693427 http://dx.doi.org/10.3389/fdata.2020.608043 Text en Copyright © 2021 Islam, Tanvir, Johnson, Akbas and Aktas. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (http://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Proximity-Based Compression for Network Embedding
topic Big Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931879/
https://www.ncbi.nlm.nih.gov/pubmed/33693427
http://dx.doi.org/10.3389/fdata.2020.608043
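
The three steps named in the description (compress by neighborhood similarity, embed the compressed graph, map the embeddings back to the original nodes) can be illustrated with a minimal sketch. This is not the authors' NECL implementation. Several choices here are assumptions made only for illustration: Jaccard similarity of closed neighborhoods as the similarity measure, the `threshold` parameter, the function names `compress`, `toy_embed`, and `embed_via_compression`, and a placeholder embedding where a real pipeline would use DeepWalk, Node2vec, or LINE. The refinement step is reduced to copying each super-node's vector to its members.

```python
def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def compress(adj, threshold):
    """Merge adjacent nodes whose closed neighborhoods are similar.

    adj: dict mapping an integer node id to the set of its neighbors
         (undirected graph, so every edge appears in both directions).
    Returns (super_of, super_adj): a map from node to super-node id,
    and the adjacency of the compressed graph.
    """
    parent = {v: v for v in adj}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Closed neighborhood: the node itself plus its neighbors.
    closed = {v: adj[v] | {v} for v in adj}
    for u in adj:
        for v in adj[u]:
            if u < v and jaccard(closed[u], closed[v]) >= threshold:
                parent[find(u)] = find(v)

    # Relabel union-find roots as consecutive super-node ids.
    ids = {}
    super_of = {}
    for v in sorted(adj):
        root = find(v)
        if root not in ids:
            ids[root] = len(ids)
        super_of[v] = ids[root]

    # Keep only edges that cross between different super-nodes.
    super_adj = {s: set() for s in ids.values()}
    for u in adj:
        for v in adj[u]:
            su, sv = super_of[u], super_of[v]
            if su != sv:
                super_adj[su].add(sv)
                super_adj[sv].add(su)
    return super_of, super_adj


def toy_embed(graph):
    """Placeholder embedding (degree, node id); NOT a real embedding method."""
    return {v: [float(len(nbrs)), float(v)] for v, nbrs in graph.items()}


def embed_via_compression(adj, threshold, embed=toy_embed):
    """Embed the compressed graph, then copy vectors back to original nodes."""
    super_of, super_adj = compress(adj, threshold)
    super_vecs = embed(super_adj)
    return {v: list(super_vecs[super_of[v]]) for v in adj}


# Two triangles joined by the bridge edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
super_of, super_adj = compress(adj, threshold=0.7)
print(super_of)
print(super_adj)
print(embed_via_compression(adj, threshold=0.7))
```

In this toy graph each triangle collapses into one super-node, so the embedding step runs on two nodes instead of six. Passing a different callable as `embed` would swap in another embedding method without changing the compression or projection code, which is the sense in which the description calls the approach a meta-strategy.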