Advanced SPARQL querying in small molecule databases
BACKGROUND: In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. Ho...
Main Authors: | Galgonek, Jakub; Hurt, Tomáš; Michlíková, Vendula; Onderka, Petr; Schwarz, Jan; Vondrášek, Jiří |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2016 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4893829/ https://www.ncbi.nlm.nih.gov/pubmed/27275187 http://dx.doi.org/10.1186/s13321-016-0144-4 |
_version_ | 1782435624371879936 |
---|---|
author | Galgonek, Jakub Hurt, Tomáš Michlíková, Vendula Onderka, Petr Schwarz, Jan Vondrášek, Jiří |
author_facet | Galgonek, Jakub Hurt, Tomáš Michlíková, Vendula Onderka, Petr Schwarz, Jan Vondrášek, Jiří |
author_sort | Galgonek, Jakub |
collection | PubMed |
description | BACKGROUND: In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. RESULTS: We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. CONCLUSIONS: Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF. GRAPHICAL ABSTRACT: [Image: see text] |
format | Online Article Text |
id | pubmed-4893829 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-4893829 2016-06-07 Advanced SPARQL querying in small molecule databases Galgonek, Jakub Hurt, Tomáš Michlíková, Vendula Onderka, Petr Schwarz, Jan Vondrášek, Jiří J Cheminform Software BACKGROUND: In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. RESULTS: We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. CONCLUSIONS: Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF. GRAPHICAL ABSTRACT: [Image: see text] Springer International Publishing 2016-06-06 /pmc/articles/PMC4893829/ /pubmed/27275187 http://dx.doi.org/10.1186/s13321-016-0144-4 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Galgonek, Jakub Hurt, Tomáš Michlíková, Vendula Onderka, Petr Schwarz, Jan Vondrášek, Jiří Advanced SPARQL querying in small molecule databases |
title | Advanced SPARQL querying in small molecule databases |
title_full | Advanced SPARQL querying in small molecule databases |
title_fullStr | Advanced SPARQL querying in small molecule databases |
title_full_unstemmed | Advanced SPARQL querying in small molecule databases |
title_short | Advanced SPARQL querying in small molecule databases |
title_sort | advanced sparql querying in small molecule databases |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4893829/ https://www.ncbi.nlm.nih.gov/pubmed/27275187 http://dx.doi.org/10.1186/s13321-016-0144-4 |
work_keys_str_mv | AT galgonekjakub advancedsparqlqueryinginsmallmoleculedatabases AT hurttomas advancedsparqlqueryinginsmallmoleculedatabases AT michlikovavendula advancedsparqlqueryinginsmallmoleculedatabases AT onderkapetr advancedsparqlqueryinginsmallmoleculedatabases AT schwarzjan advancedsparqlqueryinginsmallmoleculedatabases AT vondrasekjiri advancedsparqlqueryinginsmallmoleculedatabases |