
KG2Vec: A node2vec-based vectorization model for knowledge graph

Since the word2vec model was proposed, many researchers have vectorized the data in the research field based on it. In the field of social network, the Node2Vec model improved on the basis of word2vec can vectorize nodes and edges in social networks, so as to carry out relevant research on social ne...


Bibliographic Details
Main Authors: Wang, YueQun, Dong, LiYan, Jiang, XiaoQuan, Ma, XinTao, Li, YongLi, Zhang, Hao
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8009404/
https://www.ncbi.nlm.nih.gov/pubmed/33784319
http://dx.doi.org/10.1371/journal.pone.0248552
author Wang, YueQun
Dong, LiYan
Jiang, XiaoQuan
Ma, XinTao
Li, YongLi
Zhang, Hao
author_sort Wang, YueQun
collection PubMed
description Since the word2vec model was proposed, many researchers have used it to vectorize data in their own fields. In social network research, Node2Vec, which builds on word2vec, vectorizes the nodes and edges of a social network to support tasks such as link prediction and community division. However, a social network is a homogeneous network. When applied to heterogeneous networks such as knowledge graphs, Node2Vec leads to inaccurate predictions and unreasonable vector representations. Specifically, the walk strategy that Node2Vec uses for homogeneous networks is not suitable for heterogeneous networks, whose nodes and edges have distinguishing features. This paper proposes KG2Vec (Heterogeneous Network to Vector), a vector representation method for heterogeneous networks based on random walks and Node2Vec. It addresses the inadequate consideration of full-text semantics and contextual relations in traditional vector representations of knowledge graphs. First, the knowledge graph is reconstructed and a new random walk strategy is applied. Then, two training models and optimization strategies are proposed to capture the context between entities and relations, providing a semantically complete vector representation of the heterogeneous network. The experimental results show that KG2Vec addresses the insufficient consideration of context and the unsatisfactory handling of one-to-many relationships in traditional knowledge graph vectorization, and that it achieves higher accuracy than traditional methods.
format Online
Article
Text
id pubmed-8009404
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8009404 2021-04-07 KG2Vec: A node2vec-based vectorization model for knowledge graph Wang, YueQun Dong, LiYan Jiang, XiaoQuan Ma, XinTao Li, YongLi Zhang, Hao PLoS One Research Article
Public Library of Science 2021-03-30 /pmc/articles/PMC8009404/ /pubmed/33784319 http://dx.doi.org/10.1371/journal.pone.0248552 Text en © 2021 Wang et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title KG2Vec: A node2vec-based vectorization model for knowledge graph
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8009404/
https://www.ncbi.nlm.nih.gov/pubmed/33784319
http://dx.doi.org/10.1371/journal.pone.0248552