Learning Advanced Similarities and Training Features for Toponym Interlinking

Interlinking of spatio-textual entities is an open and quite challenging research problem, with application in several commercial fields, including geomarketing, navigation and social networks. It comprises the process of identifying, between different data sources, entity descriptions that refer to...

Bibliographic Details
Main authors:	Giannopoulos, Giorgos; Kaffes, Vassilis; Kostoulas, Georgios
Format:	Online Article Text
Language:	English
Published:	2020
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148233/ http://dx.doi.org/10.1007/978-3-030-45439-5_8

author	Giannopoulos, Giorgos; Kaffes, Vassilis; Kostoulas, Georgios
author_sort	Giannopoulos, Giorgos
collection	PubMed
description	Interlinking of spatio-textual entities is an open and quite challenging research problem, with application in several commercial fields, including geomarketing, navigation and social networks. It comprises the process of identifying, between different data sources, entity descriptions that refer to the same real-world entity. In this work, we focus on toponym interlinking; that is, we handle spatio-textual entities that are exclusively represented by their name. Additional properties, such as categories and coordinates, are considered either absent or of too low quality to be exploited in this setting. Toponyms are inherently heterogeneous entities; quite often several alternative names exist for the same toponym, with varying degrees of similarity between these names. State-of-the-art approaches mostly adopt generic, domain-agnostic similarity functions and either use them as is or incorporate them as training features within classifiers that perform toponym interlinking. We claim that capturing the specificities of toponyms and exploiting them in elaborate meta-similarity functions and derived training features can significantly increase the effectiveness of interlinking methods. To this end, we propose the LGM-Sim meta-similarity function and a series of novel similarity-based and statistical training features that can be utilized in similarity-based and classification-based interlinking settings, respectively. We demonstrate that the proposed methods achieve large increases in accuracy, in both settings, compared to several methods from the literature on the widely used Geonames toponym dataset.
format	Online Article Text
id	pubmed-7148233
institution	National Center for Biotechnology Information
language	English
publishDate	2020
record_format	MEDLINE/PubMed
spelling	pubmed-7148233 2020-04-13 Learning Advanced Similarities and Training Features for Toponym Interlinking Giannopoulos, Giorgos; Kaffes, Vassilis; Kostoulas, Georgios Advances in Information Retrieval Article 2020-03-17 /pmc/articles/PMC7148233/ http://dx.doi.org/10.1007/978-3-030-45439-5_8 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title	Learning Advanced Similarities and Training Features for Toponym Interlinking
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148233/ http://dx.doi.org/10.1007/978-3-030-45439-5_8
work_keys_str_mv	AT giannopoulosgiorgos learningadvancedsimilaritiesandtrainingfeaturesfortoponyminterlinking AT kaffesvassilis learningadvancedsimilaritiesandtrainingfeaturesfortoponyminterlinking AT kostoulasgeorgios learningadvancedsimilaritiesandtrainingfeaturesfortoponyminterlinking
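
As a rough illustration of the two settings the abstract contrasts, the Python sketch below uses generic, domain-agnostic string similarities either directly with a threshold (similarity-based interlinking) or as a feature vector that could be passed to a classifier (classification-based interlinking). It is not the LGM-Sim function or the feature set proposed by the authors, whose details are not given in this record. The function names, the choice of difflib, and the 0.8 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def generic_similarity(a: str, b: str) -> float:
    """Domain-agnostic, case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sorted_token_similarity(a: str, b: str) -> float:
    """Same measure after sorting tokens, so word order is ignored."""
    norm_a = " ".join(sorted(a.lower().split()))
    norm_b = " ".join(sorted(b.lower().split()))
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def feature_vector(a: str, b: str) -> list:
    """Similarity scores usable as training features for a classifier."""
    return [generic_similarity(a, b), sorted_token_similarity(a, b)]


def is_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Similarity-based setting: link the pair if any score reaches
    the (assumed, illustrative) threshold."""
    return max(feature_vector(a, b)) >= threshold


if __name__ == "__main__":
    for pair in [("Lake Michigan", "Michigan Lake"), ("Athens", "Oslo")]:
        print(pair, feature_vector(*pair), is_match(*pair))
```

The token-sorted variant hints at why toponym-specific handling can help: alternative names often differ only in word order, which a plain character-level comparison penalizes.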
