
An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation

BACKGROUND: For years, Uniform Resource Locator (URL) decay or "link rot" has been a growing concern in the field of biomedical sciences. This paper addresses this issue by examining the status of the URLs published in MEDLINE abstracts, establishing current availability and estimating URL...


Bibliographic Details
Main authors: Ducut, Erick; Liu, Fang; Fontelo, Paul
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2435527/
https://www.ncbi.nlm.nih.gov/pubmed/18547428
http://dx.doi.org/10.1186/1472-6947-8-23
_version_ 1782156486487572480
author Ducut, Erick
Liu, Fang
Fontelo, Paul
author_facet Ducut, Erick
Liu, Fang
Fontelo, Paul
author_sort Ducut, Erick
collection PubMed
description BACKGROUND: For years, Uniform Resource Locator (URL) decay or "link rot" has been a growing concern in the field of biomedical sciences. This paper addresses this issue by examining the status of the URLs published in MEDLINE abstracts, establishing current availability and estimating URL decay in these records from 1994 to 2006. We also reviewed the information provided by the URL to determine if the context that the author cited in writing the paper is the same information presently available in the URL. Lastly, with all the documented recommended methods to preserve URL links, we determined which among them have gained acceptance among authors and publishers. METHODS: MEDLINE records from 1994 to 2006 from the National Library of Medicine in Extensible Mark-up Language (XML) format were processed yielding 10,208 URL addresses. These were accessed once daily at random times for 30 days. Titles and abstracts were also searched for the presence of archival tools such as WebCite, Persistent URL (PURL) and Digital Object Identifier (DOI). RESULTS: Results showed that the average URL length ranged from 13 to 425 characters with a mean length of 35 characters [Standard Deviation (SD) = 13.51; 95% confidence interval (CI) 13.25 to 13.77]. The most common top-level domains were ".org" and ".edu", each with 34%. About 81% of the URL pool was available 90% to 100% of the time, but only 78% of these contained the actual information mentioned in the MEDLINE record. "Dead" URLs constituted 16% of the total. Finally, a survey of archival tool usage showed that since its introduction in 1998, only 519 of all abstracts reviewed had incorporated DOI addresses in their MEDLINE abstracts. CONCLUSION: URL persistence parallels previous studies which showed approximately 81% general availability during the 1-month study period. 
As peer-reviewed literature remains the main source of information in biomedicine, we need to ensure the accuracy and preservation of these links.
format Text
id pubmed-2435527
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-24355272008-06-24 An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation Ducut, Erick Liu, Fang Fontelo, Paul BMC Med Inform Decis Mak Research Article BACKGROUND: For years, Uniform Resource Locator (URL) decay or "link rot" has been a growing concern in the field of biomedical sciences. This paper addresses this issue by examining the status of the URLs published in MEDLINE abstracts, establishing current availability and estimating URL decay in these records from 1994 to 2006. We also reviewed the information provided by the URL to determine if the context that the author cited in writing the paper is the same information presently available in the URL. Lastly, with all the documented recommended methods to preserve URL links, we determined which among them have gained acceptance among authors and publishers. METHODS: MEDLINE records from 1994 to 2006 from the National Library of Medicine in Extensible Mark-up Language (XML) format were processed yielding 10,208 URL addresses. These were accessed once daily at random times for 30 days. Titles and abstracts were also searched for the presence of archival tools such as WebCite, Persistent URL (PURL) and Digital Object Identifier (DOI). RESULTS: Results showed that the average URL length ranged from 13 to 425 characters with a mean length of 35 characters [Standard Deviation (SD) = 13.51; 95% confidence interval (CI) 13.25 to 13.77]. The most common top-level domains were ".org" and ".edu", each with 34%. About 81% of the URL pool was available 90% to 100% of the time, but only 78% of these contained the actual information mentioned in the MEDLINE record. "Dead" URLs constituted 16% of the total. Finally, a survey of archival tool usage showed that since its introduction in 1998, only 519 of all abstracts reviewed had incorporated DOI addresses in their MEDLINE abstracts. 
CONCLUSION: URL persistence parallels previous studies which showed approximately 81% general availability during the 1-month study period. As peer-reviewed literature remains the main source of information in biomedicine, we need to ensure the accuracy and preservation of these links. BioMed Central 2008-06-11 /pmc/articles/PMC2435527/ /pubmed/18547428 http://dx.doi.org/10.1186/1472-6947-8-23 Text en Copyright © 2008 Ducut et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Ducut, Erick
Liu, Fang
Fontelo, Paul
An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation
title An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation
title_full An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation
title_fullStr An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation
title_full_unstemmed An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation
title_short An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation
title_sort update on uniform resource locator (url) decay in medline abstracts and measures for its mitigation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2435527/
https://www.ncbi.nlm.nih.gov/pubmed/18547428
http://dx.doi.org/10.1186/1472-6947-8-23