
EventDNA: a dataset for Dutch news event extraction as a basis for news diversification

News organizations increasingly tailor their news offering to the reader through personalized recommendation algorithms. However, automated recommendation algorithms reflect a commercial logic based on calculated relevance to the user, rather than aiming at a well-informed citizenry. In this paper,...


Bibliographic Details
Main authors: Colruyt, Camiel, De Clercq, Orphée, Desot, Thierry, Hoste, Véronique
Format: Online Article Text
Language: English
Published: Springer Netherlands 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9672586/
https://www.ncbi.nlm.nih.gov/pubmed/36415480
http://dx.doi.org/10.1007/s10579-022-09623-2
author Colruyt, Camiel
De Clercq, Orphée
Desot, Thierry
Hoste, Véronique
collection PubMed
description News organizations increasingly tailor their news offering to the reader through personalized recommendation algorithms. However, automated recommendation algorithms reflect a commercial logic based on calculated relevance to the user, rather than aiming at a well-informed citizenry. In this paper, we introduce the EventDNA corpus, a dataset of 1773 Dutch-language news articles annotated with information on entities, news events and IPTC Media Topic codes, with the ultimate goal to outline a recommendation algorithm that uses news event diversity rather than previous reading behaviour as a key driver for personalized news recommendation. We describe the EventDNA annotation guidelines, which are inspired by the well-known ERE framework, and conclude that it is not practical to apply a fixed event typology such as used in ERE to an unrestricted data context. The corpus and related source code are made available at https://github.com/NewsDNA-LT3/.github.
format Online
Article
Text
id pubmed-9672586
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-9672586 2022-11-18
Lang Resour Eval
Original Paper
Springer Netherlands 2022-11-17 2023
/pmc/articles/PMC9672586/ /pubmed/36415480 http://dx.doi.org/10.1007/s10579-022-09623-2
Text en
© The Author(s), under exclusive licence to Springer Nature B.V. 2022, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source.
These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title EventDNA: a dataset for Dutch news event extraction as a basis for news diversification
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9672586/
https://www.ncbi.nlm.nih.gov/pubmed/36415480
http://dx.doi.org/10.1007/s10579-022-09623-2