FTRLIM: Distributed Instance Matching Framework for Large-Scale Knowledge Graph Fusion
Instance matching is a key task in knowledge graph fusion, and improving its efficiency is critical given the increasing scale of knowledge graphs. Blocking algorithms, which select candidate instance pairs for comparison, are one effective way to achieve this goal. I...
Main authors: | Zhu, Hongming; Wang, Xiaowen; Jiang, Yizhi; Fan, Hongfei; Du, Bowen; Liu, Qin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153108/ https://www.ncbi.nlm.nih.gov/pubmed/34068208 http://dx.doi.org/10.3390/e23050602 |
_version_ | 1783698728328101888 |
---|---|
author | Zhu, Hongming Wang, Xiaowen Jiang, Yizhi Fan, Hongfei Du, Bowen Liu, Qin |
author_sort | Zhu, Hongming |
collection | PubMed |
description | Instance matching is a key task in knowledge graph fusion, and improving its efficiency is critical given the increasing scale of knowledge graphs. Blocking algorithms, which select candidate instance pairs for comparison, are one effective way to achieve this goal. In this paper, we propose a novel blocking algorithm named MultiObJ, which constructs indexes for instances based on the Ordered Joint of Multiple Objects’ features to limit the number of candidate instance pairs. Based on MultiObJ, we further propose a distributed framework named Follow-the-Regular-Leader Instance Matching (FTRLIM), which matches instances between large-scale knowledge graphs with approximately linear time complexity. FTRLIM participated in OAEI 2019 and achieved the best matching quality with significant efficiency. In this research, we construct three data collections based on a real-world large-scale knowledge graph. Experimental results on the constructed data collections and two real-world datasets indicate that MultiObJ and FTRLIM outperform other state-of-the-art methods. |
format | Online Article Text |
id | pubmed-8153108 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8153108 2021-05-27 FTRLIM: Distributed Instance Matching Framework for Large-Scale Knowledge Graph Fusion Zhu, Hongming Wang, Xiaowen Jiang, Yizhi Fan, Hongfei Du, Bowen Liu, Qin Entropy (Basel) Article Instance matching is a key task in knowledge graph fusion, and improving its efficiency is critical given the increasing scale of knowledge graphs. Blocking algorithms, which select candidate instance pairs for comparison, are one effective way to achieve this goal. In this paper, we propose a novel blocking algorithm named MultiObJ, which constructs indexes for instances based on the Ordered Joint of Multiple Objects’ features to limit the number of candidate instance pairs. Based on MultiObJ, we further propose a distributed framework named Follow-the-Regular-Leader Instance Matching (FTRLIM), which matches instances between large-scale knowledge graphs with approximately linear time complexity. FTRLIM participated in OAEI 2019 and achieved the best matching quality with significant efficiency. In this research, we construct three data collections based on a real-world large-scale knowledge graph. Experimental results on the constructed data collections and two real-world datasets indicate that MultiObJ and FTRLIM outperform other state-of-the-art methods. MDPI 2021-05-13 /pmc/articles/PMC8153108/ /pubmed/34068208 http://dx.doi.org/10.3390/e23050602 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhu, Hongming Wang, Xiaowen Jiang, Yizhi Fan, Hongfei Du, Bowen Liu, Qin FTRLIM: Distributed Instance Matching Framework for Large-Scale Knowledge Graph Fusion |
title | FTRLIM: Distributed Instance Matching Framework for Large-Scale Knowledge Graph Fusion |
title_sort | ftrlim: distributed instance matching framework for large-scale knowledge graph fusion |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153108/ https://www.ncbi.nlm.nih.gov/pubmed/34068208 http://dx.doi.org/10.3390/e23050602 |