
PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets

BACKGROUND: There are a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets a...


Bibliographic Details
Main Authors:	Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Software
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5607505/ https://www.ncbi.nlm.nih.gov/pubmed/28931422 http://dx.doi.org/10.1186/s13326-017-0151-z

_version_	1783265299461570560
author	Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J.
author_sort	Djokic-Petrovic, Marija
collection	PubMed
description	BACKGROUND: There are a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources. RESULTS: PIBAS FedSPARQL is a web-based query builder and result set visualizer for bioinformatics data. As an advanced feature, our system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm that processes named entities for use in a Vector Space Model with cosine similarity measures. To our knowledge, PIBAS FedSPARQL is unique among the systems that we found in that it allows detection of similar data items. As a query builder, our system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives such as Bio2RDF, Chem2Bio2RDF and EMBL-EBI, one local initiative called CPCTAS, and additional user-specified data sources. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users can choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of the data to enhance query results. CONCLUSIONS: The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. Also, the novel “similar data items detection” algorithm can be particularly useful for suggesting new data sources and for cost optimization of new experiments. PIBAS FedSPARQL can be expanded with new topics, subtopics and templates on demand, rendering information retrieval more robust.
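Note: the description above mentions detecting similar data items with a Vector Space Model and cosine similarity. The following is a minimal sketch of that general technique only, not the authors' implementation. It assumes each data item is represented by a short text label, skips the named-entity processing step the abstract describes, and uses plain term counts as vector weights. The URIs, labels, function names and the 0.8 threshold are all hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Build a term-frequency vector from lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty label is similar to nothing
    return dot / (norm_a * norm_b)


def similar_pairs(items, threshold=0.8):
    """Return (uri_1, uri_2, score) for every pair scoring at or above threshold."""
    vectors = {uri: term_vector(label) for uri, label in items.items()}
    pairs = []
    for (u1, v1), (u2, v2) in combinations(vectors.items(), 2):
        score = cosine_similarity(v1, v2)
        if score >= threshold:
            pairs.append((u1, u2, score))
    return pairs


# Hypothetical items: different URIs whose labels may describe the same thing.
items = {
    "http://example.org/sourceA/123": "acetylsalicylic acid aspirin",
    "http://example.org/sourceB/abc": "aspirin acetylsalicylic acid tablet",
    "http://example.org/sourceC/xyz": "caffeine",
}

for uri_1, uri_2, score in similar_pairs(items):
    print(f"{uri_1} ~ {uri_2} (cosine = {score:.3f})")
```

With these sample labels, only the first two URIs share enough terms to pass the threshold. A real system would likely weight terms (for example with TF-IDF) and tune the threshold against known matches.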
format	Online Article Text
id	pubmed-5607505
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5607505 2017-09-24 PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J. J Biomed Semantics Software BioMed Central 2017-09-20 /pmc/articles/PMC5607505/ /pubmed/28931422 http://dx.doi.org/10.1186/s13326-017-0151-z Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5607505/ https://www.ncbi.nlm.nih.gov/pubmed/28931422 http://dx.doi.org/10.1186/s13326-017-0151-z
