A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics

Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is further made inconsistent, even unrepresentative of real world usage by the lack of distinction between the different typ...

Bibliographic Details
Main authors:	Gritta, Milan; Pilehvar, Mohammad Taher; Collier, Nigel
Format:	Online Article Text
Language:	English
Published:	Springer Netherlands 2019
Subjects:	Original Paper
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7406539/ https://www.ncbi.nlm.nih.gov/pubmed/32802011 http://dx.doi.org/10.1007/s10579-019-09475-3

author	Gritta, Milan; Pilehvar, Mohammad Taher; Collier, Nigel
author_facet	Gritta, Milan; Pilehvar, Mohammad Taher; Collier, Nigel
author_sort	Gritta, Milan
collection	PubMed
description	Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is further made inconsistent, even unrepresentative of real world usage by the lack of distinction between the different types of toponyms, which necessitates new guidelines, a consolidation of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond. To address these deficiencies, our manuscript introduces a new framework in three parts. (Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms. (Part 2) Metrics: discussed and reviewed for a rigorous evaluation including recommendations for NER/Geoparsing practitioners. (Part 3) Evaluation data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping and evaluating machine learning NLP models.
format	Online Article Text
id	pubmed-7406539
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Springer Netherlands
record_format	MEDLINE/PubMed
spelling	pubmed-7406539 2020-08-13 A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics Gritta, Milan Pilehvar, Mohammad Taher Collier, Nigel Lang Resour Eval Original Paper Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is further made inconsistent, even unrepresentative of real world usage by the lack of distinction between the different types of toponyms, which necessitates new guidelines, a consolidation of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond. To address these deficiencies, our manuscript introduces a new framework in three parts. (Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms. (Part 2) Metrics: discussed and reviewed for a rigorous evaluation including recommendations for NER/Geoparsing practitioners. (Part 3) Evaluation data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping and evaluating machine learning NLP models. Springer Netherlands 2019-09-19 2020 /pmc/articles/PMC7406539/ /pubmed/32802011 http://dx.doi.org/10.1007/s10579-019-09475-3 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle	Original Paper Gritta, Milan Pilehvar, Mohammad Taher Collier, Nigel A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics
title	A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics
title_full	A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics
title_fullStr	A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics
title_full_unstemmed	A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics
title_short	A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics
title_sort	pragmatic guide to geoparsing evaluation: toponyms, named entity recognition and pragmatics
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7406539/ https://www.ncbi.nlm.nih.gov/pubmed/32802011 http://dx.doi.org/10.1007/s10579-019-09475-3

Similar items