Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot

The emergence of the web has fundamentally affected most aspects of information communication, including scholarly communication. The immediacy that characterizes publishing information to the web, as well as accessing it, allows for a dramatic increase in the speed of dissemination of scholarly kno...

Bibliographic Details
Main authors: Klein, Martin; Van de Sompel, Herbert; Sanderson, Robert; Shankar, Harihar; Balakireva, Lyudmila; Zhou, Ke; Tobin, Richard
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4277367/
https://www.ncbi.nlm.nih.gov/pubmed/25541969
http://dx.doi.org/10.1371/journal.pone.0115253
author Klein, Martin
Van de Sompel, Herbert
Sanderson, Robert
Shankar, Harihar
Balakireva, Lyudmila
Zhou, Ke
Tobin, Richard
collection PubMed
description The emergence of the web has fundamentally affected most aspects of information communication, including scholarly communication. The immediacy that characterizes publishing information to the web, as well as accessing it, allows for a dramatic increase in the speed of dissemination of scholarly knowledge. But the transition from a paper-based to a web-based scholarly communication system also poses challenges. In this paper, we focus on reference rot, the combination of link rot and content drift to which references to web resources included in Science, Technology, and Medicine (STM) articles are subject. We investigate the extent to which reference rot impacts the ability to revisit the web context that surrounds STM articles some time after their publication. We do so on the basis of a vast collection of articles from three corpora that span publication years 1997 to 2012. For over one million references to web resources extracted from over 3.5 million articles, we determine whether the HTTP URI is still responsive on the live web and whether web archives contain an archived snapshot representative of the state the referenced resource had at the time it was referenced. We observe that the fraction of articles containing references to web resources is growing steadily over time. We find that one out of five STM articles suffers from reference rot, meaning it is impossible to revisit the web context that surrounds them some time after their publication. When considering only STM articles that contain references to web resources, this fraction increases to seven out of ten. We suggest that, in order to safeguard the long-term integrity of the web-based scholarly record, robust solutions to combat the reference rot problem are required. In conclusion, we provide a brief insight into the directions that are being explored in this regard in the context of the Hiberlink project.
format Online
Article
Text
id pubmed-4277367
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4277367 2014-12-31 Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot Klein, Martin Van de Sompel, Herbert Sanderson, Robert Shankar, Harihar Balakireva, Lyudmila Zhou, Ke Tobin, Richard PLoS One Research Article Public Library of Science 2014-12-26 /pmc/articles/PMC4277367/ /pubmed/25541969 http://dx.doi.org/10.1371/journal.pone.0115253 Text en © 2014 Klein et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4277367/
https://www.ncbi.nlm.nih.gov/pubmed/25541969
http://dx.doi.org/10.1371/journal.pone.0115253