PubChemRDF: towards the semantic annotation of PubChem compound and substance databases

BACKGROUND: PubChem is an open repository for chemical structures, biological activities and biomedical annotations. Semantic Web technologies are emerging as an increasingly important approach to distribute and integrate scientific data. Exposing PubChem data to Semantic Web services may help enabl...

Bibliographic Details
Main authors: Fu, Gang; Batchelor, Colin; Dumontier, Michel; Hastings, Janna; Willighagen, Egon; Bolton, Evan
Format: Online Article Text
Language: English
Published: Springer International Publishing 2015
Subjects: Database
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4500850/
https://www.ncbi.nlm.nih.gov/pubmed/26175801
http://dx.doi.org/10.1186/s13321-015-0084-4
_version_ 1782380962901917696
author Fu, Gang
Batchelor, Colin
Dumontier, Michel
Hastings, Janna
Willighagen, Egon
Bolton, Evan
author_facet Fu, Gang
Batchelor, Colin
Dumontier, Michel
Hastings, Janna
Willighagen, Egon
Bolton, Evan
author_sort Fu, Gang
collection PubMed
description BACKGROUND: PubChem is an open repository for chemical structures, biological activities and biomedical annotations. Semantic Web technologies are emerging as an increasingly important approach to distribute and integrate scientific data. Exposing PubChem data to Semantic Web services may help enable automated data integration and management, as well as facilitate interoperable web applications. DESCRIPTION: This work, one of a series covering the PubChemRDF project, describes an approach to translate PubChem Substance and Compound information into Resource Description Framework (RDF) format. Basic examples are provided to demonstrate its use. The aim of this effort is to provide two new primary benefits to researchers in a cost-effective manner. Firstly, we aim to remove the inherent limitations of using the web-based resource PubChem by allowing a researcher to use readily available semantic technologies (namely, RDF triple stores and their corresponding SPARQL query engines) to query and analyze PubChem data on local computing resources. Secondly, this work intends to help improve data sharing, analysis, and integration of PubChem data to resources external to NCBI and across scientific domains, by means of the association of PubChem data to existing ontological frameworks, including CHEMical INFormation ontology, Semanticscience Integrated Ontology, and others. CONCLUSIONS: With the goal of semantically describing information available in the PubChem archive, pre-existing ontological frameworks were used, rather than creating new ones. Semantic relationships between compounds and substances, chemical descriptors associated with compounds and substances, interrelationships between chemicals, as well as provenance and attribute metadata of substances are described. 
[Figure: see text] ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13321-015-0084-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4500850
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-4500850 2015-07-15 PubChemRDF: towards the semantic annotation of PubChem compound and substance databases Fu, Gang Batchelor, Colin Dumontier, Michel Hastings, Janna Willighagen, Egon Bolton, Evan J Cheminform Database BACKGROUND: PubChem is an open repository for chemical structures, biological activities and biomedical annotations. Semantic Web technologies are emerging as an increasingly important approach to distribute and integrate scientific data. Exposing PubChem data to Semantic Web services may help enable automated data integration and management, as well as facilitate interoperable web applications. DESCRIPTION: This work, one of a series covering the PubChemRDF project, describes an approach to translate PubChem Substance and Compound information into Resource Description Framework (RDF) format. Basic examples are provided to demonstrate its use. The aim of this effort is to provide two new primary benefits to researchers in a cost-effective manner. Firstly, we aim to remove the inherent limitations of using the web-based resource PubChem by allowing a researcher to use readily available semantic technologies (namely, RDF triple stores and their corresponding SPARQL query engines) to query and analyze PubChem data on local computing resources. Secondly, this work intends to help improve data sharing, analysis, and integration of PubChem data to resources external to NCBI and across scientific domains, by means of the association of PubChem data to existing ontological frameworks, including CHEMical INFormation ontology, Semanticscience Integrated Ontology, and others. CONCLUSIONS: With the goal of semantically describing information available in the PubChem archive, pre-existing ontological frameworks were used, rather than creating new ones. 
Semantic relationships between compounds and substances, chemical descriptors associated with compounds and substances, interrelationships between chemicals, as well as provenance and attribute metadata of substances are described. [Figure: see text] ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13321-015-0084-4) contains supplementary material, which is available to authorized users. Springer International Publishing 2015-07-14 /pmc/articles/PMC4500850/ /pubmed/26175801 http://dx.doi.org/10.1186/s13321-015-0084-4 Text en © Fu et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Database
Fu, Gang
Batchelor, Colin
Dumontier, Michel
Hastings, Janna
Willighagen, Egon
Bolton, Evan
PubChemRDF: towards the semantic annotation of PubChem compound and substance databases
title PubChemRDF: towards the semantic annotation of PubChem compound and substance databases
title_full PubChemRDF: towards the semantic annotation of PubChem compound and substance databases
title_fullStr PubChemRDF: towards the semantic annotation of PubChem compound and substance databases
title_full_unstemmed PubChemRDF: towards the semantic annotation of PubChem compound and substance databases
title_short PubChemRDF: towards the semantic annotation of PubChem compound and substance databases
title_sort pubchemrdf: towards the semantic annotation of pubchem compound and substance databases
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4500850/
https://www.ncbi.nlm.nih.gov/pubmed/26175801
http://dx.doi.org/10.1186/s13321-015-0084-4