
The use and limits of scientific names in biological informatics

Abstract. Scientific names serve to label biodiversity information: information related to species. Names, and their underlying taxonomic definitions, however, are unstable and ambiguous. This negatively impacts the utility of names as identifiers and as effective indexing tools in biological inform...


Bibliographic Details
Main Author: Remsen, David
Format: Online Article Text
Language: English
Published: Pensoft Publishers 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4741222/
https://www.ncbi.nlm.nih.gov/pubmed/26877660
http://dx.doi.org/10.3897/zookeys.550.9546
_version_ 1782413967772090368
author Remsen, David
author_facet Remsen, David
author_sort Remsen, David
collection PubMed
description Abstract. Scientific names serve to label biodiversity information: information related to species. Names, and their underlying taxonomic definitions, however, are unstable and ambiguous. This negatively impacts the utility of names as identifiers and as effective indexing tools in biological informatics where names are commonly utilized for searching, retrieving and integrating information about species. Semiotics provides a general model for describing the relationship between taxon names and taxon concepts. It distinguishes syntactics, which governs relationships among names, from semantics, which represents the relations between those labels and the taxa to which they refer. In the semiotic context, changes in semantics (i.e., taxonomic circumscription) do not consistently result in a corresponding and reflective change in syntax. Further, when syntactic changes do occur, they may be in response to semantic changes or in response to syntactic rules. This lack of consistency in the cardinal relationship between names and taxa places limits on how scientific names may be used in biological informatics in initially anchoring, and in the subsequent retrieval and integration, of relevant biodiversity information. Precision and recall are two measures of relevance. In biological taxonomy, recall is negatively impacted by changes or ambiguity in syntax while precision is negatively impacted when there are changes or ambiguity in semantics. Because changes in syntax are not correlated with changes in semantics, scientific names may be used, singly or conflated into synonymous sets, to improve recall in pattern recognition or search and retrieval. Names cannot be used, however, to improve precision. This is because changes in syntax do not uniquely identify changes in circumscription. These observations place limits on the utility of scientific names within biological informatics applications that rely on names as identifiers for taxa. 
Taxonomic systems and services used to organize and integrate information about taxa must accommodate the inherent semantic ambiguity of scientific names. The capture and articulation of circumscription differences (i.e., multiple taxon concepts) within such systems must be accompanied with distinct concept identifiers that can be employed in association with, or in replacement of, traditional scientific names.
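
The abstract's claim about precision and recall can be illustrated with a small sketch. It is not taken from the article: the record set, the placeholder names "Aus bus" and "Xus bus", and the `concept-N` identifiers are all hypothetical, chosen so that one circumscription appears under two names and each name is also applied to a second circumscription.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str        # scientific name string attached to the record
    concept_id: str  # hypothetical identifier for the intended circumscription


# Toy data (assumed for illustration only).
RECORDS = [
    Record("Aus bus", "concept-1"),
    Record("Aus bus", "concept-2"),  # same name, different circumscription
    Record("Xus bus", "concept-1"),  # synonym: different name, same concept
    Record("Xus bus", "concept-2"),
    Record("Aus cus", "concept-3"),
]


def search_by_names(names: set[str]) -> set[int]:
    """Return indices of records whose name is in the given set of names."""
    return {i for i, r in enumerate(RECORDS) if r.name in names}


def search_by_concept(concept_id: str) -> set[int]:
    """Return indices of records tagged with the given concept identifier."""
    return {i for i, r in enumerate(RECORDS) if r.concept_id == concept_id}


def precision_recall(retrieved: set[int], relevant: set[int]) -> tuple[float, float]:
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# The records a user actually wants: everything about concept-1.
relevant = search_by_concept("concept-1")

single = precision_recall(search_by_names({"Aus bus"}), relevant)
synonyms = precision_recall(search_by_names({"Aus bus", "Xus bus"}), relevant)
by_concept = precision_recall(search_by_concept("concept-1"), relevant)

print("single name:  ", single)
print("synonym set:  ", synonyms)
print("concept id:   ", by_concept)
```

In this toy data, widening the query from one name to the synonym set raises recall while precision stays the same, because neither name distinguishes the two circumscriptions. Only the query on the (assumed) concept identifier separates them.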
format Online
Article
Text
id pubmed-4741222
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Pensoft Publishers
record_format MEDLINE/PubMed
spelling pubmed-4741222 2016-02-12 The use and limits of scientific names in biological informatics Remsen, David Zookeys Research Article Pensoft Publishers 2016-01-07 /pmc/articles/PMC4741222/ /pubmed/26877660 http://dx.doi.org/10.3897/zookeys.550.9546 Text en David Remsen http://creativecommons.org/licenses/by/4.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Remsen, David
The use and limits of scientific names in biological informatics
title The use and limits of scientific names in biological informatics
title_sort use and limits of scientific names in biological informatics
topic Research Article
work_keys_str_mv AT remsendavid theuseandlimitsofscientificnamesinbiologicalinformatics
AT remsendavid useandlimitsofscientificnamesinbiologicalinformatics