Making species checklists understandable to machines – a shift from relational databases to ontologies
BACKGROUND: The scientific names of plants and animals play a major role in Life Sciences as information is indexed, integrated, and searched using scientific names. The main problem with names is their ambiguous nature, because more than one name may point to the same taxon and multiple taxa may sh...
Main authors: | Laurenne, Nina; Tuominen, Jouni; Saarenmaa, Hannu; Hyvönen, Eero |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4417522/ https://www.ncbi.nlm.nih.gov/pubmed/25937880 http://dx.doi.org/10.1186/2041-1480-5-40 |
_version_ | 1782369370390921216 |
---|---|
author | Laurenne, Nina Tuominen, Jouni Saarenmaa, Hannu Hyvönen, Eero |
author_facet | Laurenne, Nina Tuominen, Jouni Saarenmaa, Hannu Hyvönen, Eero |
author_sort | Laurenne, Nina |
collection | PubMed |
description | BACKGROUND: The scientific names of plants and animals play a major role in Life Sciences as information is indexed, integrated, and searched using scientific names. The main problem with names is their ambiguous nature, because more than one name may point to the same taxon and multiple taxa may share the same name. In addition, scientific names change over time, which makes them open to various interpretations. Applying machine-understandable semantics to these names enables efficient processing of biological content in information systems. The first step is to use unique persistent identifiers instead of name strings when referring to taxa. The most commonly used identifiers are Life Science Identifiers (LSID), which are traditionally used in relational databases, and more recently HTTP URIs, which are applied on the Semantic Web by Linked Data applications. RESULTS: We introduce two models for expressing taxonomic information in the form of species checklists. First, we show how species checklists are presented in a relational database system using LSIDs. Then, in order to gain a more detailed representation of taxonomic information, we introduce meta-ontology TaxMeOn to model the same content as Semantic Web ontologies where taxa are identified using HTTP URIs. We also explore how changes in scientific names can be managed over time. CONCLUSIONS: The use of HTTP URIs is preferable for presenting the taxonomic information of species checklists. An HTTP URI identifies a taxon and operates as a web address from which additional information about the taxon can be located, unlike LSID. This enables the integration of biological data from different sources on the web using Linked Data principles and prevents the formation of information silos. The Linked Data approach allows a user to assemble information and evaluate the complexity of taxonomical data based on conflicting views of taxonomic classifications. 
Using HTTP URIs and Semantic Web technologies also facilitates the representation of the semantics of biological data, and in this way, the creation of more “intelligent” biological applications and services. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2041-1480-5-40) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4417522 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-44175222015-05-04 Making species checklists understandable to machines – a shift from relational databases to ontologies Laurenne, Nina Tuominen, Jouni Saarenmaa, Hannu Hyvönen, Eero J Biomed Semantics Research BACKGROUND: The scientific names of plants and animals play a major role in Life Sciences as information is indexed, integrated, and searched using scientific names. The main problem with names is their ambiguous nature, because more than one name may point to the same taxon and multiple taxa may share the same name. In addition, scientific names change over time, which makes them open to various interpretations. Applying machine-understandable semantics to these names enables efficient processing of biological content in information systems. The first step is to use unique persistent identifiers instead of name strings when referring to taxa. The most commonly used identifiers are Life Science Identifiers (LSID), which are traditionally used in relational databases, and more recently HTTP URIs, which are applied on the Semantic Web by Linked Data applications. RESULTS: We introduce two models for expressing taxonomic information in the form of species checklists. First, we show how species checklists are presented in a relational database system using LSIDs. Then, in order to gain a more detailed representation of taxonomic information, we introduce meta-ontology TaxMeOn to model the same content as Semantic Web ontologies where taxa are identified using HTTP URIs. We also explore how changes in scientific names can be managed over time. CONCLUSIONS: The use of HTTP URIs is preferable for presenting the taxonomic information of species checklists. An HTTP URI identifies a taxon and operates as a web address from which additional information about the taxon can be located, unlike LSID. 
This enables the integration of biological data from different sources on the web using Linked Data principles and prevents the formation of information silos. The Linked Data approach allows a user to assemble information and evaluate the complexity of taxonomical data based on conflicting views of taxonomic classifications. Using HTTP URIs and Semantic Web technologies also facilitates the representation of the semantics of biological data, and in this way, the creation of more “intelligent” biological applications and services. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2041-1480-5-40) contains supplementary material, which is available to authorized users. BioMed Central 2014-09-08 /pmc/articles/PMC4417522/ /pubmed/25937880 http://dx.doi.org/10.1186/2041-1480-5-40 Text en © Laurenne et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Laurenne, Nina Tuominen, Jouni Saarenmaa, Hannu Hyvönen, Eero Making species checklists understandable to machines – a shift from relational databases to ontologies |
title | Making species checklists understandable to machines – a shift from relational databases to ontologies |
title_full | Making species checklists understandable to machines – a shift from relational databases to ontologies |
title_fullStr | Making species checklists understandable to machines – a shift from relational databases to ontologies |
title_full_unstemmed | Making species checklists understandable to machines – a shift from relational databases to ontologies |
title_short | Making species checklists understandable to machines – a shift from relational databases to ontologies |
title_sort | making species checklists understandable to machines – a shift from relational databases to ontologies |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4417522/ https://www.ncbi.nlm.nih.gov/pubmed/25937880 http://dx.doi.org/10.1186/2041-1480-5-40 |
work_keys_str_mv | AT laurennenina makingspecieschecklistsunderstandabletomachinesashiftfromrelationaldatabasestoontologies AT tuominenjouni makingspecieschecklistsunderstandabletomachinesashiftfromrelationaldatabasestoontologies AT saarenmaahannu makingspecieschecklistsunderstandabletomachinesashiftfromrelationaldatabasestoontologies AT hyvoneneero makingspecieschecklistsunderstandabletomachinesashiftfromrelationaldatabasestoontologies |