
Unsupervised Embedding Learning for Large-Scale Heterogeneous Networks Based on Metapath Graph Sampling


Bibliographic Details
Main Authors: Zhong, Hongwei; Wang, Mingyang; Zhang, Xinyue
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955212/
https://www.ncbi.nlm.nih.gov/pubmed/36832662
http://dx.doi.org/10.3390/e25020297
Description
Summary: Learning node embedding vectors in large-scale heterogeneous networks without supervision is a key problem in heterogeneous network embedding research. This paper proposes an unsupervised embedding learning model named LHGI (Large-scale Heterogeneous Graph Infomax). LHGI uses metapath-guided subgraph sampling, which compresses the network while retaining as much of its semantic information as possible. LHGI also adopts the idea of contrastive learning, taking the mutual information between normal/negative node vectors and the global graph vector as the objective function that guides learning. Maximizing this mutual information lets LHGI train the network without supervised information. The experimental results show that, compared with the baseline models, LHGI extracts features better on both medium-scale and large-scale unsupervised heterogeneous networks, and the node vectors it generates achieve better performance in downstream mining tasks.
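
The summary does not give the formula for the objective, so the following is only a minimal sketch of a mutual-information-style contrastive loss, assuming the form used in Deep Graph Infomax: a mean readout for the global graph vector, a bilinear discriminator, and a binary cross-entropy loss. The names `infomax_loss`, `h_pos`, `h_neg`, and `W` are hypothetical and do not come from the paper, and the random arrays stand in for encoder outputs.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def infomax_loss(h_pos, h_neg, W):
    """Binary cross-entropy surrogate for node/graph mutual information.

    h_pos: (n, d) node vectors from the real (sampled) graph.
    h_neg: (n, d) node vectors from a corrupted graph (negatives).
    W:     (d, d) bilinear discriminator weights (assumed form).
    """
    # Global graph vector: squashed mean of the real node vectors.
    s = sigmoid(h_pos.mean(axis=0))
    # Discriminator scores each node vector against the graph vector.
    pos_scores = sigmoid(h_pos @ W @ s)
    neg_scores = sigmoid(h_neg @ W @ s)
    eps = 1e-12  # guards against log(0)
    # Minimizing this loss pushes real nodes toward 1 and negatives toward 0.
    return -(np.log(pos_scores + eps).mean()
             + np.log(1.0 - neg_scores + eps).mean()) / 2.0


# Toy data standing in for encoder outputs (not real results).
rng = np.random.default_rng(0)
h_pos = rng.normal(size=(5, 4))
# Assumption: negatives are produced by shuffling rows, as a stand-in
# for encoding a corrupted graph.
h_neg = h_pos[rng.permutation(5)]
W = np.eye(4)

loss = infomax_loss(h_pos, h_neg, W)
print(float(loss))
```

In training, an encoder and `W` would be updated by gradient descent on this loss; with an uninformative discriminator (all scores equal to 0.5) the loss equals log 2, which gives a baseline for checking that learning is making progress.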