
Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library

BACKGROUND: The Biodiversity Heritage Library (BHL) is a large digital archive of legacy biological literature, comprising over 31 million pages scanned from books, monographs, and journals. During the digitisation process basic metadata about the scanned items is recorded, but not article-level met...


Bibliographic Details
Main author: Page, Roderic DM
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3129327/
https://www.ncbi.nlm.nih.gov/pubmed/21605356
http://dx.doi.org/10.1186/1471-2105-12-187
author Page, Roderic DM
collection PubMed
description BACKGROUND: The Biodiversity Heritage Library (BHL) is a large digital archive of legacy biological literature, comprising over 31 million pages scanned from books, monographs, and journals. During the digitisation process basic metadata about the scanned items is recorded, but not article-level metadata. Given that the article is the standard unit of citation, this makes it difficult to locate cited literature in BHL. Adding the ability to easily find articles in BHL would greatly enhance the value of the archive. DESCRIPTION: A service was developed to locate articles in BHL based on matching article metadata to BHL metadata using approximate string matching, regular expressions, and string alignment. This article locating service is exposed as a standard OpenURL resolver on the BioStor web site http://biostor.org/openurl/. This resolver can be used on the web, or called by bibliographic tools that support OpenURL. CONCLUSIONS: BioStor provides tools for extracting, annotating, and visualising articles from the Biodiversity Heritage Library. BioStor is available from http://biostor.org/.
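The abstract notes that the resolver can be called by any tool that supports OpenURL. A minimal sketch of building such a request is given here. The base address is the one quoted in the abstract; the key names (genre, title, volume, spage, date) follow OpenURL 0.1 conventions, and it is an assumption that the BioStor resolver accepts exactly this set. The function name build_openurl and the citation values are hypothetical placeholders, not taken from the record.

```python
from urllib.parse import urlencode

# Resolver address quoted in the abstract.
BIOSTOR_OPENURL = "http://biostor.org/openurl/"


def build_openurl(journal, volume, start_page, year):
    """Return a resolver URL for one article citation.

    The key names are OpenURL 0.1 conventions; that BioStor accepts
    exactly these keys is an assumption, not something the record states.
    """
    params = {
        "genre": "article",
        "title": journal,          # journal title, per OpenURL 0.1
        "volume": str(volume),
        "spage": str(start_page),  # first page of the article
        "date": str(year),
    }
    # urlencode percent-encodes values and turns spaces into '+'.
    return BIOSTOR_OPENURL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder citation values, for illustration only.
    print(build_openurl("Example Journal of Zoology", 12, 187, 1901))
```

The request is built as a plain query string, so it can be pasted into a browser or issued from a reference manager; no request is sent by the sketch itself.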
format Online
Article
Text
id pubmed-3129327
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3129327 2011-07-05 Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library Page, Roderic DM BMC Bioinformatics Database BACKGROUND: The Biodiversity Heritage Library (BHL) is a large digital archive of legacy biological literature, comprising over 31 million pages scanned from books, monographs, and journals. During the digitisation process basic metadata about the scanned items is recorded, but not article-level metadata. Given that the article is the standard unit of citation, this makes it difficult to locate cited literature in BHL. Adding the ability to easily find articles in BHL would greatly enhance the value of the archive. DESCRIPTION: A service was developed to locate articles in BHL based on matching article metadata to BHL metadata using approximate string matching, regular expressions, and string alignment. This article locating service is exposed as a standard OpenURL resolver on the BioStor web site http://biostor.org/openurl/. This resolver can be used on the web, or called by bibliographic tools that support OpenURL. CONCLUSIONS: BioStor provides tools for extracting, annotating, and visualising articles from the Biodiversity Heritage Library. BioStor is available from http://biostor.org/. BioMed Central 2011-05-23 /pmc/articles/PMC3129327/ /pubmed/21605356 http://dx.doi.org/10.1186/1471-2105-12-187 Text en Copyright ©2011 Page; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3129327/
https://www.ncbi.nlm.nih.gov/pubmed/21605356
http://dx.doi.org/10.1186/1471-2105-12-187