A journey to Semantic Web query federation in the life sciences

BACKGROUND: As interest in adopting the Semantic Web in the biomedical domain continues to grow, Semantic Web technology has been evolving and maturing. A variety of technological approaches including triplestore technologies, SPARQL endpoints, Linked Data, and Vocabulary of Interlinked Datasets hav...

Bibliographic Details
Main authors: Cheung, Kei-Hoi, Frost, H Robert, Marshall, M Scott, Prud'hommeaux, Eric, Samwald, Matthias, Zhao, Jun, Paschke, Adrian
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2755818/
https://www.ncbi.nlm.nih.gov/pubmed/19796394
http://dx.doi.org/10.1186/1471-2105-10-S10-S10
author Cheung, Kei-Hoi
Frost, H Robert
Marshall, M Scott
Prud'hommeaux, Eric
Samwald, Matthias
Zhao, Jun
Paschke, Adrian
collection PubMed
description BACKGROUND: As interest in adopting the Semantic Web in the biomedical domain continues to grow, Semantic Web technology has been evolving and maturing. A variety of technological approaches, including triplestore technologies, SPARQL endpoints, Linked Data, and the Vocabulary of Interlinked Datasets (voiD), have emerged in recent years. In addition to data warehouse construction, these approaches can be used to support dynamic query federation. As a community effort, the BioRDF task force, within the Semantic Web for Health Care and Life Sciences Interest Group, is exploring how these emerging approaches can be used to execute distributed queries across different neuroscience data sources.
METHODS AND RESULTS: We have created two health care and life science knowledge bases. We have explored a variety of Semantic Web approaches to describe, map, and dynamically query multiple datasets. We have demonstrated several federation approaches that integrate diverse types of information about neurons and receptors that play an important role in basic, clinical, and translational neuroscience research. In particular, we have created a prototype receptor explorer that uses OWL mappings to provide an integrated list of receptors and executes individual queries against different SPARQL endpoints. We have also employed the AIDA Toolkit, which is directed at groups of knowledge workers who cooperatively search, annotate, interpret, and enrich large collections of heterogeneous documents from diverse locations. We have explored a tool called "FeDeRate", which enables a global SPARQL query to be decomposed into subqueries against remote databases offering either SPARQL or SQL query interfaces. Finally, we have explored how to use voiD to create metadata describing datasets exposed as Linked Data URIs or SPARQL endpoints.
CONCLUSION: We have demonstrated the use of a set of novel and state-of-the-art Semantic Web technologies in support of a neuroscience query federation scenario. We have identified both the strengths and weaknesses of these technologies. While the Semantic Web offers a global data model including the use of Uniform Resource Identifiers (URIs), the proliferation of semantically equivalent URIs hinders large-scale data integration. Our work helps direct research and tool development, which will benefit this community.
format Text
id pubmed-2755818
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2755818 2009-10-03 BMC Bioinformatics Research BioMed Central 2009-10-01 /pmc/articles/PMC2755818/ /pubmed/19796394 http://dx.doi.org/10.1186/1471-2105-10-S10-S10 Text en © Cheung et al; licensee BioMed Central Ltd. 2009 This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A journey to Semantic Web query federation in the life sciences
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2755818/
https://www.ncbi.nlm.nih.gov/pubmed/19796394
http://dx.doi.org/10.1186/1471-2105-10-S10-S10