EventDNA: a dataset for Dutch news event extraction as a basis for news diversification
News organizations increasingly tailor their news offering to the reader through personalized recommendation algorithms. However, automated recommendation algorithms reflect a commercial logic based on calculated relevance to the user, rather than aiming at a well-informed citizenry. In this paper,...
Main authors: | Colruyt, Camiel; De Clercq, Orphée; Desot, Thierry; Hoste, Véronique |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Netherlands 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9672586/ https://www.ncbi.nlm.nih.gov/pubmed/36415480 http://dx.doi.org/10.1007/s10579-022-09623-2 |
_version_ | 1784832769191313408 |
---|---|
author | Colruyt, Camiel De Clercq, Orphée Desot, Thierry Hoste, Véronique |
author_facet | Colruyt, Camiel De Clercq, Orphée Desot, Thierry Hoste, Véronique |
author_sort | Colruyt, Camiel |
collection | PubMed |
description | News organizations increasingly tailor their news offering to the reader through personalized recommendation algorithms. However, automated recommendation algorithms reflect a commercial logic based on calculated relevance to the user, rather than aiming at a well-informed citizenry. In this paper, we introduce the EventDNA corpus, a dataset of 1773 Dutch-language news articles annotated with information on entities, news events and IPTC Media Topic codes, with the ultimate goal to outline a recommendation algorithm that uses news event diversity rather than previous reading behaviour as a key driver for personalized news recommendation. We describe the EventDNA annotation guidelines, which are inspired by the well-known ERE framework and conclude that it is not practical to apply a fixed event typology such as used in ERE to an unrestricted data context. The corpus and related source code is made available at https://github.com/NewsDNA-LT3/.github. |
format | Online Article Text |
id | pubmed-9672586 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Netherlands |
record_format | MEDLINE/PubMed |
spelling | pubmed-96725862022-11-18 EventDNA: a dataset for Dutch news event extraction as a basis for news diversification Colruyt, Camiel De Clercq, Orphée Desot, Thierry Hoste, Véronique Lang Resour Eval Original Paper News organizations increasingly tailor their news offering to the reader through personalized recommendation algorithms. However, automated recommendation algorithms reflect a commercial logic based on calculated relevance to the user, rather than aiming at a well-informed citizenry. In this paper, we introduce the EventDNA corpus, a dataset of 1773 Dutch-language news articles annotated with information on entities, news events and IPTC Media Topic codes, with the ultimate goal to outline a recommendation algorithm that uses news event diversity rather than previous reading behaviour as a key driver for personalized news recommendation. We describe the EventDNA annotation guidelines, which are inspired by the well-known ERE framework and conclude that it is not practical to apply a fixed event typology such as used in ERE to an unrestricted data context. The corpus and related source code is made available at https://github.com/NewsDNA-LT3/.github. Springer Netherlands 2022-11-17 2023 /pmc/articles/PMC9672586/ /pubmed/36415480 http://dx.doi.org/10.1007/s10579-022-09623-2 Text en © The Author(s), under exclusive licence to Springer Nature B.V. 2022, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. 
These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Paper Colruyt, Camiel De Clercq, Orphée Desot, Thierry Hoste, Véronique EventDNA: a dataset for Dutch news event extraction as a basis for news diversification |
title | EventDNA: a dataset for Dutch news event extraction as a basis for news diversification |
title_full | EventDNA: a dataset for Dutch news event extraction as a basis for news diversification |
title_fullStr | EventDNA: a dataset for Dutch news event extraction as a basis for news diversification |
title_full_unstemmed | EventDNA: a dataset for Dutch news event extraction as a basis for news diversification |
title_short | EventDNA: a dataset for Dutch news event extraction as a basis for news diversification |
title_sort | eventdna: a dataset for dutch news event extraction as a basis for news diversification |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9672586/ https://www.ncbi.nlm.nih.gov/pubmed/36415480 http://dx.doi.org/10.1007/s10579-022-09623-2 |
work_keys_str_mv | AT colruytcamiel eventdnaadatasetfordutchnewseventextractionasabasisfornewsdiversification AT declercqorphee eventdnaadatasetfordutchnewseventextractionasabasisfornewsdiversification AT desotthierry eventdnaadatasetfordutchnewseventextractionasabasisfornewsdiversification AT hosteveronique eventdnaadatasetfordutchnewseventextractionasabasisfornewsdiversification |