
Information extraction pipelines for knowledge graphs

Bibliographic Details
Main Authors: Jaradeh, Mohamad Yaser; Singh, Kuldeep; Stocker, Markus; Both, Andreas; Auer, Sören
Format: Online Article Text
Language: English
Published: Springer London 2023
Subjects: Regular Paper
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9823264/
https://www.ncbi.nlm.nih.gov/pubmed/36643405
http://dx.doi.org/10.1007/s10115-022-01826-x
author Jaradeh, Mohamad Yaser
Singh, Kuldeep
Stocker, Markus
Both, Andreas
Auer, Sören
collection PubMed
description In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community’s disjoint efforts on KG completion. We include more components into the architecture of Plumber to comprise 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers overall 432 distinct pipelines. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over three KGs: DBpedia, Wikidata, and Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components and discuss their limitations.
format Online
Article
Text
id pubmed-9823264
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer London
record_format MEDLINE/PubMed
spelling pubmed-9823264 2023-01-09 Information extraction pipelines for knowledge graphs Knowl Inf Syst Regular Paper
Springer London 2023-01-07 2023 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Information extraction pipelines for knowledge graphs
topic Regular Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9823264/
https://www.ncbi.nlm.nih.gov/pubmed/36643405
http://dx.doi.org/10.1007/s10115-022-01826-x