impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers
Text Reuse reveals meaningful reiterations of text in large corpora. Humanities researchers use text reuse to study, e.g., the posterior reception of influential texts or to reveal evolving publication practices of historical media. This research is often supported by interactive visualizations whic...
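Main Authors: | Düring, Marten; Romanello, Matteo; Ehrmann, Maud; Beelen, Kaspar; Guido, Daniele; Deseure, Brecht; Bunout, Estelle; Keck, Jana; Apostolopoulos, Petros |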
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Big Data |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654985/ https://www.ncbi.nlm.nih.gov/pubmed/38025945 http://dx.doi.org/10.3389/fdata.2023.1249469 |
_version_ | 1785136728875466752 |
---|---|
author | Düring, Marten Romanello, Matteo Ehrmann, Maud Beelen, Kaspar Guido, Daniele Deseure, Brecht Bunout, Estelle Keck, Jana Apostolopoulos, Petros |
author_sort | Düring, Marten |
collection | PubMed |
description | Text Reuse reveals meaningful reiterations of text in large corpora. Humanities researchers use text reuse to study, e.g., the posterior reception of influential texts or to reveal evolving publication practices of historical media. This research is often supported by interactive visualizations which highlight relations and differences between text segments. In this paper, we build on earlier work in this domain. We present impresso Text Reuse at Scale, to our knowledge the first interface that integrates text reuse data with other forms of semantic enrichment to enable a versatile and scalable exploration of intertextual relations in historical newspaper corpora. The Text Reuse at Scale interface was developed as part of the impresso project and combines powerful search and filter operations with close and distant reading perspectives. We integrate text reuse data with enrichments derived from topic modeling, named entity recognition and classification, language and document type detection, as well as a rich set of newspaper metadata. We report on historical research objectives and common user tasks for the analysis of historical text reuse data and present the prototype interface together with the results of a user evaluation. |
format | Online Article Text |
id | pubmed-10654985 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10654985 2023-11-03 impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers Düring, Marten Romanello, Matteo Ehrmann, Maud Beelen, Kaspar Guido, Daniele Deseure, Brecht Bunout, Estelle Keck, Jana Apostolopoulos, Petros Front Big Data Big Data Text Reuse reveals meaningful reiterations of text in large corpora. Humanities researchers use text reuse to study, e.g., the posterior reception of influential texts or to reveal evolving publication practices of historical media. This research is often supported by interactive visualizations which highlight relations and differences between text segments. In this paper, we build on earlier work in this domain. We present impresso Text Reuse at Scale, to our knowledge the first interface that integrates text reuse data with other forms of semantic enrichment to enable a versatile and scalable exploration of intertextual relations in historical newspaper corpora. The Text Reuse at Scale interface was developed as part of the impresso project and combines powerful search and filter operations with close and distant reading perspectives. We integrate text reuse data with enrichments derived from topic modeling, named entity recognition and classification, language and document type detection, as well as a rich set of newspaper metadata. We report on historical research objectives and common user tasks for the analysis of historical text reuse data and present the prototype interface together with the results of a user evaluation. Frontiers Media S.A. 2023-11-03 /pmc/articles/PMC10654985/ /pubmed/38025945 http://dx.doi.org/10.3389/fdata.2023.1249469 Text en Copyright © 2023 Düring, Romanello, Ehrmann, Beelen, Guido, Deseure, Bunout, Keck and Apostolopoulos. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers |
topic | Big Data |