BioCarian: search engine for exploratory searches in heterogeneous biological databases
BACKGROUND: A large number of biological databases are publicly available to scientists on the web. In addition, many private databases are generated in the course of research projects. These databases come in a wide variety of formats. Web standards have evolved in recent times and semanti...
Main authors: | Zaki, Nazar; Tennakoon, Chandana |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5625622/ https://www.ncbi.nlm.nih.gov/pubmed/28969593 http://dx.doi.org/10.1186/s12859-017-1840-4 |
_version_ | 1783268416394625024 |
---|---|
author | Zaki, Nazar Tennakoon, Chandana |
author_facet | Zaki, Nazar Tennakoon, Chandana |
author_sort | Zaki, Nazar |
collection | PubMed |
description | BACKGROUND: A large number of biological databases are publicly available to scientists on the web. In addition, many private databases are generated in the course of research projects. These databases come in a wide variety of formats. Web standards have evolved in recent times, and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Integration and querying of biological databases can therefore be facilitated by semantic web techniques. Heterogeneous databases can be converted into the Resource Description Framework (RDF) and queried using the SPARQL language. Searching these databases with exact queries is trivial. Exploratory searches, however, need customized solutions, especially when multiple databases are involved. This process is cumbersome and time-consuming for those without a sufficient background in computer science. In this context, a search engine that facilitates exploratory searches of databases would be of great help to the scientific community. RESULTS: We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical, facet-based interface for exploring the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed and has additional features, such as ranking facet values by several criteria, visually indicating the relevance of a facet value, and presenting the most important facet values when a large number of choices are available. Advanced users can run SPARQL queries directly on the databases. Using this feature, users can incorporate federated searches of SPARQL endpoints. We used the search engine to perform an exploratory search on previously published viral integration data and were able to deduce the main conclusions of the original publication. BioCarian is accessible via http://www.biocarian.com. CONCLUSIONS: We have developed a search engine for exploring RDF databases that can be used by both novice and advanced users. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1840-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5625622 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5625622 2017-10-12 BioCarian: search engine for exploratory searches in heterogeneous biological databases Zaki, Nazar Tennakoon, Chandana BMC Bioinformatics Software BioMed Central 2017-10-02 /pmc/articles/PMC5625622/ /pubmed/28969593 http://dx.doi.org/10.1186/s12859-017-1840-4 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Zaki, Nazar Tennakoon, Chandana BioCarian: search engine for exploratory searches in heterogeneous biological databases |
title | BioCarian: search engine for exploratory searches in heterogeneous biological databases |
title_full | BioCarian: search engine for exploratory searches in heterogeneous biological databases |
title_fullStr | BioCarian: search engine for exploratory searches in heterogeneous biological databases |
title_full_unstemmed | BioCarian: search engine for exploratory searches in heterogeneous biological databases |
title_short | BioCarian: search engine for exploratory searches in heterogeneous biological databases |
title_sort | biocarian: search engine for exploratory searches in heterogeneous biological databases |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5625622/ https://www.ncbi.nlm.nih.gov/pubmed/28969593 http://dx.doi.org/10.1186/s12859-017-1840-4 |
work_keys_str_mv | AT zakinazar biocariansearchengineforexploratorysearchesinheterogeneousbiologicaldatabases AT tennakoonchandana biocariansearchengineforexploratorysearchesinheterogeneousbiologicaldatabases |
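
The description field above outlines a pipeline in which tabular databases are converted to RDF and then explored through facets. The following is a minimal sketch of that general idea, not BioCarian's actual implementation. The namespace `http://example.org/`, the function names (`table_to_triples`, `to_ntriples`, `facet_counts`), and the toy rows are all hypothetical and chosen only for illustration. Triples are held as plain Python tuples rather than in a triple store, and ranking facet values by frequency is just one possible criterion.

```python
from collections import Counter
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not from the article


def column_iri(column):
    """Build the (hypothetical) predicate IRI used for a table column."""
    return f"{BASE}column/{quote(column)}"


def table_to_triples(rows, key):
    """Turn each row (a dict) into (subject, predicate, object) triples.

    The column named `key` identifies the row and becomes part of the
    subject IRI; every other non-empty cell becomes one triple.
    """
    triples = []
    for row in rows:
        subject = f"{BASE}record/{quote(str(row[key]))}"
        for column, value in row.items():
            if column == key or value in ("", None):
                continue
            triples.append((subject, column_iri(column), str(value)))
    return triples


def to_ntriples(triples):
    """Serialize the triples as N-Triples lines with plain string literals."""
    lines = []
    for s, p, o in triples:
        escaped = (
            o.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


def facet_counts(triples, column, selected=None):
    """Count the values of one facet among records matching `selected`.

    `selected` maps column names to required values (facets already
    chosen). The result is sorted by descending count.
    """
    selected = selected or {}
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p] = o
    target = column_iri(column)
    counts = Counter()
    for props in by_subject.values():
        matches = all(props.get(column_iri(c)) == v for c, v in selected.items())
        if matches and target in props:
            counts[props[target]] += 1
    return counts.most_common()


# Toy rows with placeholder values; they do not come from any real dataset.
rows = [
    {"id": "s1", "virus": "virusA", "gene": "geneX"},
    {"id": "s2", "virus": "virusA", "gene": "geneY"},
    {"id": "s3", "virus": "virusB", "gene": "geneX"},
]
triples = table_to_triples(rows, key="id")
print(to_ntriples(triples))
print(facet_counts(triples, "gene"))
print(facet_counts(triples, "gene", selected={"virus": "virusA"}))
```

In a real deployment the triples would typically be loaded into a triple store, and the facet counts would come from SPARQL aggregate queries rather than an in-memory loop.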