
Processing SPARQL queries with regular expressions in RDF databases

BACKGROUND: As the Resource Description Framework (RDF) data model is widely used for modeling and sharing many online bioinformatics resources such as UniProt (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL, the W3C-recommended query language for RDF databases, has become an im...


Bibliographic Details
Main authors: Lee, Jinsoo, Pham, Minh-Duc, Lee, Jihwan, Han, Wook-Shin, Cho, Hune, Yu, Hwanjo, Lee, Jeong-Hoon
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3073186/
https://www.ncbi.nlm.nih.gov/pubmed/21489225
http://dx.doi.org/10.1186/1471-2105-12-S2-S6
author Lee, Jinsoo
Pham, Minh-Duc
Lee, Jihwan
Han, Wook-Shin
Cho, Hune
Yu, Hwanjo
Lee, Jeong-Hoon
collection PubMed
description BACKGROUND: As the Resource Description Framework (RDF) data model is widely used for modeling and sharing many online bioinformatics resources such as UniProt (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL, the W3C-recommended query language for RDF databases, has become an important language for querying bioinformatics knowledge bases. Moreover, because users’ requests for extracting information from RDF data are diverse and users often do not know the exact value of each fact in the RDF databases, it is desirable to use SPARQL queries with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most existing techniques for processing regular expressions are designed for querying a text corpus, or only support matching over paths in an RDF graph. RESULTS: In this paper, we propose a novel framework for supporting regular expression processing in SPARQL queries. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework to existing query optimizers. 3) We build a prototype of the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. CONCLUSIONS: Experiments with a full-blown RDF engine show that our framework outperforms existing approaches by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.
format Text
id pubmed-3073186
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3073186; 2011-04-12; Processing SPARQL queries with regular expressions in RDF databases; Lee, Jinsoo; Pham, Minh-Duc; Lee, Jihwan; Han, Wook-Shin; Cho, Hune; Yu, Hwanjo; Lee, Jeong-Hoon; BMC Bioinformatics; Proceedings
BioMed Central 2011-03-29 /pmc/articles/PMC3073186/ /pubmed/21489225 http://dx.doi.org/10.1186/1471-2105-12-S2-S6 Text en Copyright ©2011 Lee et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Processing SPARQL queries with regular expressions in RDF databases
topic Proceedings