
impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers

Text Reuse reveals meaningful reiterations of text in large corpora. Humanities researchers use text reuse to study, e.g., the posterior reception of influential texts or to reveal evolving publication practices of historical media. This research is often supported by interactive visualizations whic...


Bibliographic Details
Main authors: Düring, Marten; Romanello, Matteo; Ehrmann, Maud; Beelen, Kaspar; Guido, Daniele; Deseure, Brecht; Bunout, Estelle; Keck, Jana; Apostolopoulos, Petros
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654985/
https://www.ncbi.nlm.nih.gov/pubmed/38025945
http://dx.doi.org/10.3389/fdata.2023.1249469
_version_ 1785136728875466752
author Düring, Marten
Romanello, Matteo
Ehrmann, Maud
Beelen, Kaspar
Guido, Daniele
Deseure, Brecht
Bunout, Estelle
Keck, Jana
Apostolopoulos, Petros
author_facet Düring, Marten
Romanello, Matteo
Ehrmann, Maud
Beelen, Kaspar
Guido, Daniele
Deseure, Brecht
Bunout, Estelle
Keck, Jana
Apostolopoulos, Petros
author_sort Düring, Marten
collection PubMed
description Text reuse reveals meaningful reiterations of text in large corpora. Humanities researchers use text reuse to study, e.g., the later reception of influential texts or to reveal evolving publication practices of historical media. This research is often supported by interactive visualizations which highlight relations and differences between text segments. In this paper, we build on earlier work in this domain. We present impresso Text Reuse at Scale, to our knowledge the first interface that integrates text reuse data with other forms of semantic enrichment to enable a versatile and scalable exploration of intertextual relations in historical newspaper corpora. The Text Reuse at Scale interface was developed as part of the impresso project and combines powerful search and filter operations with close and distant reading perspectives. We integrate text reuse data with enrichments derived from topic modeling, named entity recognition and classification, language and document type detection, as well as a rich set of newspaper metadata. We report on historical research objectives and common user tasks for the analysis of historical text reuse data and present the prototype interface together with the results of a user evaluation.
format Online
Article
Text
id pubmed-10654985
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10654985 2023-11-03 impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers Düring, Marten Romanello, Matteo Ehrmann, Maud Beelen, Kaspar Guido, Daniele Deseure, Brecht Bunout, Estelle Keck, Jana Apostolopoulos, Petros Front Big Data Big Data Text Reuse reveals meaningful reiterations of text in large corpora. Humanities researchers use text reuse to study, e.g., the posterior reception of influential texts or to reveal evolving publication practices of historical media. This research is often supported by interactive visualizations which highlight relations and differences between text segments. In this paper, we build on earlier work in this domain. We present impresso Text Reuse at Scale, the to our knowledge first interface which integrates text reuse data with other forms of semantic enrichment to enable a versatile and scalable exploration of intertextual relations in historical newspaper corpora. The Text Reuse at Scale interface was developed as part of the impresso project and combines powerful search and filter operations with close and distant reading perspectives. We integrate text reuse data with enrichments derived from topic modeling, named entity recognition and classification, language and document type detection as well as a rich set of newspaper metadata. We report on historical research objectives and common user tasks for the analysis of historical text reuse data and present the prototype interface together with the results of a user evaluation. Frontiers Media S.A. 2023-11-03 /pmc/articles/PMC10654985/ /pubmed/38025945 http://dx.doi.org/10.3389/fdata.2023.1249469 Text en Copyright © 2023 Düring, Romanello, Ehrmann, Beelen, Guido, Deseure, Bunout, Keck and Apostolopoulos. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Big Data
Düring, Marten
Romanello, Matteo
Ehrmann, Maud
Beelen, Kaspar
Guido, Daniele
Deseure, Brecht
Bunout, Estelle
Keck, Jana
Apostolopoulos, Petros
impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers
title impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers
title_full impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers
title_fullStr impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers
title_full_unstemmed impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers
title_short impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers
title_sort impresso text reuse at scale. an interface for the exploration of text reuse data in semantically enriched historical newspapers
topic Big Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654985/
https://www.ncbi.nlm.nih.gov/pubmed/38025945
http://dx.doi.org/10.3389/fdata.2023.1249469
work_keys_str_mv AT duringmarten impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers
AT romanellomatteo impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers
AT ehrmannmaud impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers
AT beelenkaspar impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers
AT guidodaniele impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers
AT deseurebrecht impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers
AT bunoutestelle impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers
AT keckjana impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers
AT apostolopoulospetros impressotextreuseatscaleaninterfacefortheexplorationoftextreusedatainsemanticallyenrichedhistoricalnewspapers