Advanced SPARQL querying in small molecule databases

BACKGROUND: In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. Ho...


Bibliographic Details
Main authors: Galgonek, Jakub, Hurt, Tomáš, Michlíková, Vendula, Onderka, Petr, Schwarz, Jan, Vondrášek, Jiří
Format: Online Article Text
Language: English
Published: Springer International Publishing 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4893829/
https://www.ncbi.nlm.nih.gov/pubmed/27275187
http://dx.doi.org/10.1186/s13321-016-0144-4
_version_ 1782435624371879936
author Galgonek, Jakub
Hurt, Tomáš
Michlíková, Vendula
Onderka, Petr
Schwarz, Jan
Vondrášek, Jiří
author_facet Galgonek, Jakub
Hurt, Tomáš
Michlíková, Vendula
Onderka, Petr
Schwarz, Jan
Vondrášek, Jiří
author_sort Galgonek, Jakub
collection PubMed
description BACKGROUND: In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. RESULTS: We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. CONCLUSIONS: Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF. GRAPHICAL ABSTRACT: [Image: see text]
format Online
Article
Text
id pubmed-4893829
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-48938292016-06-07 Advanced SPARQL querying in small molecule databases Galgonek, Jakub Hurt, Tomáš Michlíková, Vendula Onderka, Petr Schwarz, Jan Vondrášek, Jiří J Cheminform Software BACKGROUND: In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. RESULTS: We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. CONCLUSIONS: Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF. 
GRAPHICAL ABSTRACT: [Image: see text] Springer International Publishing 2016-06-06 /pmc/articles/PMC4893829/ /pubmed/27275187 http://dx.doi.org/10.1186/s13321-016-0144-4 Text en © The Author(s) 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Galgonek, Jakub
Hurt, Tomáš
Michlíková, Vendula
Onderka, Petr
Schwarz, Jan
Vondrášek, Jiří
Advanced SPARQL querying in small molecule databases
title Advanced SPARQL querying in small molecule databases
title_full Advanced SPARQL querying in small molecule databases
title_fullStr Advanced SPARQL querying in small molecule databases
title_full_unstemmed Advanced SPARQL querying in small molecule databases
title_short Advanced SPARQL querying in small molecule databases
title_sort advanced sparql querying in small molecule databases
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4893829/
https://www.ncbi.nlm.nih.gov/pubmed/27275187
http://dx.doi.org/10.1186/s13321-016-0144-4
work_keys_str_mv AT galgonekjakub advancedsparqlqueryinginsmallmoleculedatabases
AT hurttomas advancedsparqlqueryinginsmallmoleculedatabases
AT michlikovavendula advancedsparqlqueryinginsmallmoleculedatabases
AT onderkapetr advancedsparqlqueryinginsmallmoleculedatabases
AT schwarzjan advancedsparqlqueryinginsmallmoleculedatabases
AT vondrasekjiri advancedsparqlqueryinginsmallmoleculedatabases