Named entity disambiguation in short texts over knowledge graphs

The ever-growing usage of knowledge graphs (KGs) positions named entity disambiguation (NED) at the heart of designing accurate KG-driven systems such as query answering systems (QAS). According to the current research, most studies dealing with NED on KGs involve long texts, which is not the case o...

Bibliographic Details
Main authors: Bouarroudj, Wissem, Boufaida, Zizette, Bellatreche, Ladjel
Format: Online Article Text
Language: English
Published: Springer London 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8722665/
https://www.ncbi.nlm.nih.gov/pubmed/35001999
http://dx.doi.org/10.1007/s10115-021-01642-9
author Bouarroudj, Wissem
Boufaida, Zizette
Bellatreche, Ladjel
collection PubMed
description The ever-growing usage of knowledge graphs (KGs) positions named entity disambiguation (NED) at the heart of designing accurate KG-driven systems such as query answering systems (QAS). According to current research, most studies dealing with NED on KGs involve long texts, whereas short text fragments, characterized by their limited context, are less studied. Yet the accuracy of QASs strongly depends on how such short texts are handled. This limitation motivates this paper, which studies the NED problem on KGs for short texts only. We propose a NED approach comprising the following steps: (i) expanding the context using WordNet and measuring its similarity to the resource context; (ii) exploiting the coherence between entities in queries that contain more than one entity, such as “Is Michelle Obama the wife of Barack Obama?”; (iii) taking into account the relations between words to calculate their similarity with the properties of a resource; and (iv) using syntactic features. The proposed approach is compared with state-of-the-art approaches on five datasets. The experimental results show that it outperforms these systems by 27% in F-measure. A system called Welink, which implements our proposal, is available on GitHub and is also accessible via a REST API.
format Online
Article
Text
id pubmed-8722665
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer London
record_format MEDLINE/PubMed
spelling pubmed-8722665 2022-01-04 Named entity disambiguation in short texts over knowledge graphs Bouarroudj, Wissem Boufaida, Zizette Bellatreche, Ladjel Knowl Inf Syst Regular Paper Springer London 2022-01-03 2022 /pmc/articles/PMC8722665/ /pubmed/35001999 http://dx.doi.org/10.1007/s10115-021-01642-9 Text en © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source.
These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Named entity disambiguation in short texts over knowledge graphs
topic Regular Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8722665/
https://www.ncbi.nlm.nih.gov/pubmed/35001999
http://dx.doi.org/10.1007/s10115-021-01642-9