
HeteEdgeWalk: A Heterogeneous Edge Memory Random Walk for Heterogeneous Information Network Embedding

Most Heterogeneous Information Network (HIN) embedding methods use meta-paths to guide random walks that sample from the HIN and perform representation learning, in order to overcome the bias of traditional random walks toward high-order nodes. Their performance depends on the suitabi...


Bibliographic Details
Main Authors: Liu, Zhenpeng; Zhang, Shengcong; Zhang, Jialiang; Jiang, Mingxiao; Liu, Yi
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10378489/
https://www.ncbi.nlm.nih.gov/pubmed/37509945
http://dx.doi.org/10.3390/e25070998
_version_ 1785079779400089600
author Liu, Zhenpeng
Zhang, Shengcong
Zhang, Jialiang
Jiang, Mingxiao
Liu, Yi
collection PubMed
description Most Heterogeneous Information Network (HIN) embedding methods use meta-paths to guide random walks that sample from the HIN and perform representation learning, in order to overcome the bias of traditional random walks toward high-order nodes. Their performance depends on how well the generated meta-paths suit the HIN at hand. Defining meta-paths requires domain expertise, which makes the results overly dependent on the chosen meta-paths. Moreover, it is difficult to represent the structure of a complex HIN with a single meta-path. In a meta-path-guided random walk, some heterogeneous structures (e.g., node types) are not among those specified by the meta-path, so this heterogeneous information is ignored. In this paper, we propose HeteEdgeWalk, a method that does not rely on meta-paths. We design a dynamically adjusted bidirectional edge-sampling walk strategy. Specifically, edge sampling and the storage of recently selected edge types are used to sample the network structure in a more balanced and comprehensive way. Finally, node classification and clustering experiments are performed on four real HINs, with in-depth analysis. The results show an improvement of up to 2% in node classification and of at least 0.6% in clustering over the baselines. This demonstrates the method's ability to effectively capture semantic information from HINs.
format Online
Article
Text
id pubmed-10378489
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10378489 2023-07-29 HeteEdgeWalk: A Heterogeneous Edge Memory Random Walk for Heterogeneous Information Network Embedding Liu, Zhenpeng; Zhang, Shengcong; Zhang, Jialiang; Jiang, Mingxiao; Liu, Yi Entropy (Basel) Article MDPI 2023-06-29 /pmc/articles/PMC10378489/ /pubmed/37509945 http://dx.doi.org/10.3390/e25070998 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title HeteEdgeWalk: A Heterogeneous Edge Memory Random Walk for Heterogeneous Information Network Embedding
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10378489/
https://www.ncbi.nlm.nih.gov/pubmed/37509945
http://dx.doi.org/10.3390/e25070998