Digitising legacy zoological taxonomic literature: Processes, products and using the output

Abstract. By digitising legacy taxonomic literature using XML mark-up the contents become accessible to other taxonomic and nomenclatural information systems. Appropriate schemas need to be interoperable with other sectorial schemas, atomise to appropriate content elements and carry appropriate meta...

Bibliographic Details
Main author:	Lyal, Christopher H. C.
Format:	Online Article Text
Language:	English
Published:	Pensoft Publishers 2016
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4741221/ https://www.ncbi.nlm.nih.gov/pubmed/26877659 http://dx.doi.org/10.3897/zookeys.550.9702

_version_	1782413967542452224
author	Lyal, Christopher H. C.
collection	PubMed
description	Abstract. By digitising legacy taxonomic literature using XML mark-up the contents become accessible to other taxonomic and nomenclatural information systems. Appropriate schemas need to be interoperable with other sectorial schemas, atomise to appropriate content elements and carry appropriate metadata to, for example, enable algorithmic assessment of availability of a name under the Code. Legacy (and new) literature delivered in this fashion will become part of a global taxonomic resource from which users can extract tailored content to meet their particular needs, be they nomenclatural, taxonomic, faunistic or other. To date, most digitisation of taxonomic literature has led to a more or less simple digital copy of a paper original – the output of the many efforts has effectively been an electronic copy of a traditional library. While this has increased accessibility of publications through internet access, the means by which many scientific papers are indexed and located is much the same as with traditional libraries. OCR and born-digital papers allow use of web search engines to locate instances of taxon names and other terms, but OCR efficiency in recognising taxonomic names is still relatively poor, people’s ability to use search engines effectively is mixed, and many papers cannot be searched directly. Instead of building digital analogues of traditional publications, we should consider what properties we require of future taxonomic information access. Ideally the content of each new digital publication should be accessible in the context of all previous published data, and the user able to retrieve nomenclatural, taxonomic and other data / information in the form required without having to scan all of the original papers and extract target content manually. 
This opens the door to dynamic linking of new content with extant systems: automatic population and updating of taxonomic catalogues, ZooBank and faunal lists, all descriptions of a taxon and its children instantly accessible with a single search, comparison of classifications used in different publications, and so on. A means to do this is through marking up content into XML, and the more atomised the mark-up the greater the possibilities for data retrieval and integration. Mark-up requires XML that accommodates the required content elements and is interoperable with other XML schemas, and there are now several written to do this, particularly TaxPub, taxonX and taXMLit, the last of these being the most atomised. We now need to automate this process as far as possible. Manual and automatic data and information retrieval is demonstrated by projects such as INOTAXA and Plazi. As we move to creating and using taxonomic products through the power of the internet, we need to ensure the output, while satisfying in its production the requirements of the Code, is fit for purpose in the future.
format	Online Article Text
id	pubmed-4741221
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Pensoft Publishers
record_format	MEDLINE/PubMed
spelling	pubmed-4741221 2016-02-12 Digitising legacy zoological taxonomic literature: Processes, products and using the output Lyal, Christopher H. C. Zookeys Research Article Pensoft Publishers 2016-01-07 /pmc/articles/PMC4741221/ /pubmed/26877659 http://dx.doi.org/10.3897/zookeys.550.9702 Text en Christopher H. C. Lyal http://creativecommons.org/licenses/by/4.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article Lyal, Christopher H. C. Digitising legacy zoological taxonomic literature: Processes, products and using the output
title	Digitising legacy zoological taxonomic literature: Processes, products and using the output
title_sort	digitising legacy zoological taxonomic literature: processes, products and using the output
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4741221/ https://www.ncbi.nlm.nih.gov/pubmed/26877659 http://dx.doi.org/10.3897/zookeys.550.9702
work_keys_str_mv	AT lyalchristopherhc digitisinglegacyzoologicaltaxonomicliteratureprocessesproductsandusingtheoutput
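
The abstract's claim that atomised XML mark-up lets a single search return "all descriptions of a taxon and its children" can be illustrated with a minimal sketch. The element and attribute names below (`treatment`, `name`, `rank`, `parent`, `description`) and the taxon names are invented for illustration; they are not taken from TaxPub, taxonX or taXMLit, and real schemas are considerably richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical mark-up: element names, attributes and taxon names are
# invented for this sketch and do NOT follow TaxPub, taxonX or taXMLit.
SAMPLE = """
<publication year="1901">
  <treatment>
    <name rank="genus">Examplus</name>
    <description>Body elongate.</description>
  </treatment>
  <treatment>
    <name rank="species" parent="Examplus">Examplus primus</name>
    <description>Elytra black.</description>
  </treatment>
  <treatment>
    <name rank="species" parent="Otherus">Otherus secundus</name>
    <description>Elytra red.</description>
  </treatment>
</publication>
"""


def descriptions_for(xml_text, taxon):
    """Return (name, description) pairs for a taxon and its direct children."""
    root = ET.fromstring(xml_text)
    hits = []
    for treatment in root.iter("treatment"):
        name = treatment.find("name")
        if name is None:
            continue
        # Match the taxon itself, or any name whose (assumed) parent
        # attribute points at it.
        if name.text == taxon or name.get("parent") == taxon:
            hits.append((name.text, treatment.findtext("description", default="")))
    return hits


if __name__ == "__main__":
    for name, text in descriptions_for(SAMPLE, "Examplus"):
        print(f"{name}: {text}")
```

Because each name and description is a separate element, the query needs no text parsing or OCR clean-up; with a scanned page image or an unstructured text layer the same retrieval would require reading the whole paper.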
