A cross disciplinary study of link decay and the effectiveness of mitigation techniques


Bibliographic Details
Main Authors: Hennessey, Jason; Ge, Steven Xijin
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3851533/
https://www.ncbi.nlm.nih.gov/pubmed/24266891
http://dx.doi.org/10.1186/1471-2105-14-S14-S5
collection PubMed
description BACKGROUND: The dynamic, decentralized World Wide Web has become an essential part of scientific research and communication. Researchers create thousands of web sites every year to share software, data and services. These valuable resources tend to disappear over time. The problem has been documented in many subject areas. Our goal is to conduct a cross-disciplinary investigation of the problem and test the effectiveness of existing remedies.
RESULTS: We accessed 14,489 unique web pages found in abstracts, published between 1996 and 2010, within Thomson Reuters' Web of Science citation index. The median lifespan of these web pages was 9.3 years, and 62% of them were archived. Survival analysis and logistic regression were used to find significant predictors of URL lifespan. The availability of a web page depends most on when it was published and on its top-level domain. Similar statistical analysis revealed biases in current solutions: the Internet Archive favors web pages with fewer layers in the Uniform Resource Locator (URL), while WebCite is significantly influenced by the source of publication. We also created a prototype for a process to submit web pages to the archives and increased coverage of our list of scientific web pages in the Internet Archive and WebCite by 22% and 255%, respectively.
CONCLUSION: Our results show that link decay continues to be a problem across different disciplines and that current solutions for static web pages are helping and can be improved.
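The abstract names two URL properties as predictors: the top-level domain and the number of layers in the URL. The record does not say how the authors computed either one, so the sketch below is only an illustration under stated assumptions: the top-level domain is taken as the last dot-separated label of the host name, and "layers" is taken as the count of non-empty path segments. The function name `url_features` is hypothetical.

```python
from urllib.parse import urlsplit


def url_features(url: str) -> dict:
    """Return illustrative URL features: top-level domain and path depth."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Assumption: the top-level domain is the last dot-separated host label.
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    # Assumption: "layers" means the number of non-empty path segments.
    depth = len([segment for segment in parts.path.split("/") if segment])
    return {"tld": tld, "depth": depth}


if __name__ == "__main__":
    # The two URLs below are the access links listed in this record.
    for link in (
        "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3851533/",
        "http://dx.doi.org/10.1186/1471-2105-14-S14-S5",
    ):
        print(link, url_features(link))
```

Features like these could then serve as covariates in a survival or logistic regression model, which is the kind of analysis the abstract describes; the modelling step itself is not shown here.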
id pubmed-3851533
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-3851533 2013-12-20. Hennessey, Jason; Ge, Steven Xijin. A cross disciplinary study of link decay and the effectiveness of mitigation techniques. BMC Bioinformatics, Proceedings. BioMed Central, 2013-10-09. /pmc/articles/PMC3851533/ /pubmed/24266891 http://dx.doi.org/10.1186/1471-2105-14-S14-S5 Text, en.
Copyright © 2013 Hennessey and Ge; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Proceedings