S3QL: A distributed domain specific language for controlled semantic integration of life sciences data

BACKGROUND: The value and usefulness of data increases when it is explicitly interlinked with related data. This is the core principle of Linked Data. For life sciences researchers, harnessing the power of Linked Data to improve biological discovery is still challenged by a need to keep pace with ra...

Bibliographic Details
Main authors: Deus, Helena F, Correa, Miriã C, Stanislaus, Romesh, Miragaia, Maria, Maass, Wolfgang, de Lencastre, Hermínia, Fox, Ronan, Almeida, Jonas S
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3155508/
https://www.ncbi.nlm.nih.gov/pubmed/21756325
http://dx.doi.org/10.1186/1471-2105-12-285
author Deus, Helena F
Correa, Miriã C
Stanislaus, Romesh
Miragaia, Maria
Maass, Wolfgang
de Lencastre, Hermínia
Fox, Ronan
Almeida, Jonas S
author_facet Deus, Helena F
Correa, Miriã C
Stanislaus, Romesh
Miragaia, Maria
Maass, Wolfgang
de Lencastre, Hermínia
Fox, Ronan
Almeida, Jonas S
author_sort Deus, Helena F
collection PubMed
description BACKGROUND: The value and usefulness of data increases when it is explicitly interlinked with related data. This is the core principle of Linked Data. For life sciences researchers, harnessing the power of Linked Data to improve biological discovery is still challenged by a need to keep pace with rapidly evolving domains and requirements for collaboration and control as well as with the reference semantic web ontologies and standards. Knowledge organization systems (KOSs) can provide an abstraction for publishing biological discoveries as Linked Data without complicating transactions with contextual minutia such as provenance and access control. We have previously described the Simple Sloppy Semantic Database (S3DB) as an efficient model for creating knowledge organization systems using Linked Data best practices with explicit distinction between domain and instantiation and support for a permission control mechanism that automatically migrates between the two. In this report we present a domain specific language, the S3DB query language (S3QL), to operate on its underlying core model and facilitate management of Linked Data. RESULTS: Reflecting the data driven nature of our approach, S3QL has been implemented as an application programming interface for S3DB systems hosting biomedical data, and its syntax was subsequently generalized beyond the S3DB core model. This achievement is illustrated with the assembly of an S3QL query to manage entities from the Simple Knowledge Organization System. The illustrative use cases include gastrointestinal clinical trials, genomic characterization of cancer by The Cancer Genome Atlas (TCGA) and molecular epidemiology of infectious diseases. CONCLUSIONS: S3QL was found to provide a convenient mechanism to represent context for interoperation between public and private datasets hosted at biomedical research institutions and linked data formalisms.
format Online
Article
Text
id pubmed-3155508
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-31555082011-08-13 S3QL: A distributed domain specific language for controlled semantic integration of life sciences data Deus, Helena F Correa, Miriã C Stanislaus, Romesh Miragaia, Maria Maass, Wolfgang de Lencastre, Hermínia Fox, Ronan Almeida, Jonas S BMC Bioinformatics Research Article BACKGROUND: The value and usefulness of data increases when it is explicitly interlinked with related data. This is the core principle of Linked Data. For life sciences researchers, harnessing the power of Linked Data to improve biological discovery is still challenged by a need to keep pace with rapidly evolving domains and requirements for collaboration and control as well as with the reference semantic web ontologies and standards. Knowledge organization systems (KOSs) can provide an abstraction for publishing biological discoveries as Linked Data without complicating transactions with contextual minutia such as provenance and access control. We have previously described the Simple Sloppy Semantic Database (S3DB) as an efficient model for creating knowledge organization systems using Linked Data best practices with explicit distinction between domain and instantiation and support for a permission control mechanism that automatically migrates between the two. In this report we present a domain specific language, the S3DB query language (S3QL), to operate on its underlying core model and facilitate management of Linked Data. RESULTS: Reflecting the data driven nature of our approach, S3QL has been implemented as an application programming interface for S3DB systems hosting biomedical data, and its syntax was subsequently generalized beyond the S3DB core model. This achievement is illustrated with the assembly of an S3QL query to manage entities from the Simple Knowledge Organization System. 
The illustrative use cases include gastrointestinal clinical trials, genomic characterization of cancer by The Cancer Genome Atlas (TCGA) and molecular epidemiology of infectious diseases. CONCLUSIONS: S3QL was found to provide a convenient mechanism to represent context for interoperation between public and private datasets hosted at biomedical research institutions and linked data formalisms. BioMed Central 2011-07-14 /pmc/articles/PMC3155508/ /pubmed/21756325 http://dx.doi.org/10.1186/1471-2105-12-285 Text en Copyright ©2011 Deus et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Deus, Helena F
Correa, Miriã C
Stanislaus, Romesh
Miragaia, Maria
Maass, Wolfgang
de Lencastre, Hermínia
Fox, Ronan
Almeida, Jonas S
S3QL: A distributed domain specific language for controlled semantic integration of life sciences data
title S3QL: A distributed domain specific language for controlled semantic integration of life sciences data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3155508/
https://www.ncbi.nlm.nih.gov/pubmed/21756325
http://dx.doi.org/10.1186/1471-2105-12-285