TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus
Medieval documents are a rich source of historical data. Performing named-entity recognition (NER) on this genre of texts can provide us with valuable historical evidence. However, traditional NER categories and schemes are usually designed with modern documents in mind (i.e. journalistic text) and...
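Main authors: | Álvarez-Mellado, Elena; Díez-Platas, María Luisa; Ruiz-Fabo, Pablo; Bermúdez, Helena; Ros, Salvador; González-Blanco, Elena |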
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Netherlands 2021 |
Subjects: | Project Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550670/ https://www.ncbi.nlm.nih.gov/pubmed/34776810 http://dx.doi.org/10.1007/s10579-020-09516-2 |
_version_ | 1784591004795404288 |
---|---|
author | Álvarez-Mellado, Elena Díez-Platas, María Luisa Ruiz-Fabo, Pablo Bermúdez, Helena Ros, Salvador González-Blanco, Elena |
author_facet | Álvarez-Mellado, Elena Díez-Platas, María Luisa Ruiz-Fabo, Pablo Bermúdez, Helena Ros, Salvador González-Blanco, Elena |
author_sort | Álvarez-Mellado, Elena |
collection | PubMed |
description | Medieval documents are a rich source of historical data. Performing named-entity recognition (NER) on this genre of texts can provide us with valuable historical evidence. However, traditional NER categories and schemes are usually designed with modern documents in mind (i.e. journalistic text) and the general-domain NER annotation schemes fail to capture the nature of medieval entities. In this paper we explore the challenges of performing named-entity annotation on a corpus of Spanish medieval documents: we discuss the mismatches that arise when applying traditional NER categories to a corpus of Spanish medieval documents and we propose a novel humanist-friendly TEI-compliant annotation scheme and guidelines intended to capture the particular nature of medieval entities. |
format | Online Article Text |
id | pubmed-8550670 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer Netherlands |
record_format | MEDLINE/PubMed |
spelling | pubmed-8550670 2021-11-10 TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus Álvarez-Mellado, Elena Díez-Platas, María Luisa Ruiz-Fabo, Pablo Bermúdez, Helena Ros, Salvador González-Blanco, Elena Lang Resour Eval Project Notes Medieval documents are a rich source of historical data. Performing named-entity recognition (NER) on this genre of texts can provide us with valuable historical evidence. However, traditional NER categories and schemes are usually designed with modern documents in mind (i.e. journalistic text) and the general-domain NER annotation schemes fail to capture the nature of medieval entities. In this paper we explore the challenges of performing named-entity annotation on a corpus of Spanish medieval documents: we discuss the mismatches that arise when applying traditional NER categories to a corpus of Spanish medieval documents and we propose a novel humanist-friendly TEI-compliant annotation scheme and guidelines intended to capture the particular nature of medieval entities. Springer Netherlands 2021-02-27 2021 /pmc/articles/PMC8550670/ /pubmed/34776810 http://dx.doi.org/10.1007/s10579-020-09516-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Project Notes Álvarez-Mellado, Elena Díez-Platas, María Luisa Ruiz-Fabo, Pablo Bermúdez, Helena Ros, Salvador González-Blanco, Elena TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus |
title | TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus |
title_full | TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus |
title_fullStr | TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus |
title_full_unstemmed | TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus |
title_short | TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus |
title_sort | tei-friendly annotation scheme for medieval named entities: a case on a spanish medieval corpus |
topic | Project Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550670/ https://www.ncbi.nlm.nih.gov/pubmed/34776810 http://dx.doi.org/10.1007/s10579-020-09516-2 |
work_keys_str_mv | AT alvarezmelladoelena teifriendlyannotationschemeformedievalnamedentitiesacaseonaspanishmedievalcorpus AT diezplatasmarialuisa teifriendlyannotationschemeformedievalnamedentitiesacaseonaspanishmedievalcorpus AT ruizfabopablo teifriendlyannotationschemeformedievalnamedentitiesacaseonaspanishmedievalcorpus AT bermudezhelena teifriendlyannotationschemeformedievalnamedentitiesacaseonaspanishmedievalcorpus AT rossalvador teifriendlyannotationschemeformedievalnamedentitiesacaseonaspanishmedievalcorpus AT gonzalezblancoelena teifriendlyannotationschemeformedievalnamedentitiesacaseonaspanishmedievalcorpus |