
FTRLIM: Distributed Instance Matching Framework for Large-Scale Knowledge Graph Fusion

Instance matching is a key task in knowledge graph fusion, and improving its efficiency is critical given the increasing scale of knowledge graphs. Blocking algorithms, which select candidate instance pairs for comparison, are one of the effective methods for achieving this goal. I...


Bibliographic Details
Main authors: Zhu, Hongming; Wang, Xiaowen; Jiang, Yizhi; Fan, Hongfei; Du, Bowen; Liu, Qin
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153108/
https://www.ncbi.nlm.nih.gov/pubmed/34068208
http://dx.doi.org/10.3390/e23050602
author Zhu, Hongming
Wang, Xiaowen
Jiang, Yizhi
Fan, Hongfei
Du, Bowen
Liu, Qin
collection PubMed
description Instance matching is a key task in knowledge graph fusion, and improving its efficiency is critical given the increasing scale of knowledge graphs. Blocking algorithms, which select candidate instance pairs for comparison, are one of the effective methods for achieving this goal. In this paper, we propose a novel blocking algorithm named MultiObJ, which constructs indexes for instances based on the Ordered Joint of Multiple Objects’ features to limit the number of candidate instance pairs. Based on MultiObJ, we further propose a distributed framework named Follow-the-Regular-Leader Instance Matching (FTRLIM), which matches instances between large-scale knowledge graphs with approximately linear time complexity. FTRLIM participated in OAEI 2019 and achieved the best matching quality with significant efficiency. In this research, we construct three data collections based on a real-world large-scale knowledge graph. Experimental results on the constructed data collections and two real-world datasets indicate that MultiObJ and FTRLIM outperform other state-of-the-art methods.
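The blocking idea named in the description can be illustrated with a minimal Python sketch. This is not the authors' MultiObJ implementation: it is a generic key-based blocking example under assumed conventions, in which each instance is a dictionary mapping predicate names to lists of object values, and the index key is formed by joining the sorted, normalised object values of a few chosen predicates. The function names, the data layout, and the sample records are all hypothetical.

```python
from collections import defaultdict


def block_key(instance, predicates):
    """Join the normalised, sorted object values of the chosen predicates.

    Hypothetical key scheme for illustration only; the record does not
    specify how MultiObJ builds its index keys.
    """
    parts = []
    for predicate in predicates:
        values = sorted(str(v).strip().lower() for v in instance.get(predicate, []))
        parts.append("|".join(values))
    return "#".join(parts)


def candidate_pairs(source, target, predicates):
    """Return (source_id, target_id) pairs whose blocking keys are equal."""
    # Index the target graph once so each source lookup is a dictionary access.
    index = defaultdict(list)
    for target_id, instance in target.items():
        index[block_key(instance, predicates)].append(target_id)

    pairs = []
    for source_id, instance in source.items():
        for target_id in index.get(block_key(instance, predicates), []):
            pairs.append((source_id, target_id))
    return pairs


# Made-up sample instances (assumption: predicate -> list of object values).
source = {
    "s1": {"name": ["Alice"], "year": [1990]},
    "s2": {"name": ["Bob"], "year": [1985]},
}
target = {
    "t1": {"name": ["alice "], "year": [1990]},
    "t2": {"name": ["Carol"], "year": [1990]},
}

pairs = candidate_pairs(source, target, ["name", "year"])
print(pairs)
```

In this toy case only one of the four possible source-target pairs is kept for detailed comparison, which is the kind of reduction blocking is meant to achieve. Exact key equality is a simplifying assumption here; a real system would need a scheme that tolerates noisy values.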
format Online
Article
Text
id pubmed-8153108
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-81531082021-05-27 FTRLIM: Distributed Instance Matching Framework for Large-Scale Knowledge Graph Fusion Zhu, Hongming Wang, Xiaowen Jiang, Yizhi Fan, Hongfei Du, Bowen Liu, Qin Entropy (Basel) Article Instance matching is a key task in knowledge graph fusion, and improving its efficiency is critical given the increasing scale of knowledge graphs. Blocking algorithms, which select candidate instance pairs for comparison, are one of the effective methods for achieving this goal. In this paper, we propose a novel blocking algorithm named MultiObJ, which constructs indexes for instances based on the Ordered Joint of Multiple Objects’ features to limit the number of candidate instance pairs. Based on MultiObJ, we further propose a distributed framework named Follow-the-Regular-Leader Instance Matching (FTRLIM), which matches instances between large-scale knowledge graphs with approximately linear time complexity. FTRLIM participated in OAEI 2019 and achieved the best matching quality with significant efficiency. In this research, we construct three data collections based on a real-world large-scale knowledge graph. Experimental results on the constructed data collections and two real-world datasets indicate that MultiObJ and FTRLIM outperform other state-of-the-art methods. MDPI 2021-05-13 /pmc/articles/PMC8153108/ /pubmed/34068208 http://dx.doi.org/10.3390/e23050602 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title FTRLIM: Distributed Instance Matching Framework for Large-Scale Knowledge Graph Fusion
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153108/
https://www.ncbi.nlm.nih.gov/pubmed/34068208
http://dx.doi.org/10.3390/e23050602