Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration

BACKGROUND: Over the past several centuries, chemistry has permeated virtually every facet of human lifestyle, enriching fields as diverse as medicine, agriculture, manufacturing, warfare, and electronics, among numerous others. Unfortunately, application-specific, incompatible chemical information...

Bibliographic Details
Main authors: Chepelev, Leonid L, Dumontier, Michel
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3121712/
https://www.ncbi.nlm.nih.gov/pubmed/21595881
http://dx.doi.org/10.1186/1758-2946-3-20
_version_ 1782206856160083968
author Chepelev, Leonid L
Dumontier, Michel
author_facet Chepelev, Leonid L
Dumontier, Michel
author_sort Chepelev, Leonid L
collection PubMed
description BACKGROUND: Over the past several centuries, chemistry has permeated virtually every facet of human lifestyle, enriching fields as diverse as medicine, agriculture, manufacturing, warfare, and electronics, among numerous others. Unfortunately, application-specific, incompatible chemical information formats and representation strategies have emerged as a result of such diverse adoption of chemistry. Although a number of efforts have been dedicated to unifying the computational representation of chemical information, disparities between the various chemical databases still persist and stand in the way of cross-domain, interdisciplinary investigations. Through a common syntax and formal semantics, Semantic Web technology offers the ability to accurately represent, integrate, reason about and query across diverse chemical information. RESULTS: Here we specify and implement the Chemical Entity Semantic Specification (CHESS) for the representation of polyatomic chemical entities, their substructures, bonds, atoms, and reactions using Semantic Web technologies. CHESS provides means to capture aspects of their corresponding chemical descriptors, connectivity, functional composition, and geometric structure while specifying mechanisms for data provenance. We demonstrate that using our readily extensible specification, it is possible to efficiently integrate multiple disparate chemical data sources, while retaining appropriate correspondence of chemical descriptors, with very little additional effort. We demonstrate the impact of some of our representational decisions on the performance of chemically-aware knowledgebase searching and rudimentary reaction candidate selection. Finally, we provide access to the tools necessary to carry out chemical entity encoding in CHESS, along with a sample knowledgebase. 
CONCLUSIONS: By harnessing the power of Semantic Web technologies with CHESS, it is possible to provide a means of facile cross-domain chemical knowledge integration with full preservation of data correspondence and provenance. Our representation builds on existing cheminformatics technologies and, by the virtue of RDF specification, remains flexible and amenable to application- and domain-specific annotations without compromising chemical data integration. We conclude that the adoption of a consistent and semantically-enabled chemical specification is imperative for surviving the coming chemical data deluge and supporting systems science research.
format Online
Article
Text
id pubmed-3121712
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-31217122011-06-24 Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration Chepelev, Leonid L Dumontier, Michel J Cheminform Methodology BACKGROUND: Over the past several centuries, chemistry has permeated virtually every facet of human lifestyle, enriching fields as diverse as medicine, agriculture, manufacturing, warfare, and electronics, among numerous others. Unfortunately, application-specific, incompatible chemical information formats and representation strategies have emerged as a result of such diverse adoption of chemistry. Although a number of efforts have been dedicated to unifying the computational representation of chemical information, disparities between the various chemical databases still persist and stand in the way of cross-domain, interdisciplinary investigations. Through a common syntax and formal semantics, Semantic Web technology offers the ability to accurately represent, integrate, reason about and query across diverse chemical information. RESULTS: Here we specify and implement the Chemical Entity Semantic Specification (CHESS) for the representation of polyatomic chemical entities, their substructures, bonds, atoms, and reactions using Semantic Web technologies. CHESS provides means to capture aspects of their corresponding chemical descriptors, connectivity, functional composition, and geometric structure while specifying mechanisms for data provenance. We demonstrate that using our readily extensible specification, it is possible to efficiently integrate multiple disparate chemical data sources, while retaining appropriate correspondence of chemical descriptors, with very little additional effort. We demonstrate the impact of some of our representational decisions on the performance of chemically-aware knowledgebase searching and rudimentary reaction candidate selection. 
Finally, we provide access to the tools necessary to carry out chemical entity encoding in CHESS, along with a sample knowledgebase. CONCLUSIONS: By harnessing the power of Semantic Web technologies with CHESS, it is possible to provide a means of facile cross-domain chemical knowledge integration with full preservation of data correspondence and provenance. Our representation builds on existing cheminformatics technologies and, by the virtue of RDF specification, remains flexible and amenable to application- and domain-specific annotations without compromising chemical data integration. We conclude that the adoption of a consistent and semantically-enabled chemical specification is imperative for surviving the coming chemical data deluge and supporting systems science research. BioMed Central 2011-05-19 /pmc/articles/PMC3121712/ /pubmed/21595881 http://dx.doi.org/10.1186/1758-2946-3-20 Text en Copyright ©2011 Chepelev and Dumontier; licensee Chemistry Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology
Chepelev, Leonid L
Dumontier, Michel
Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration
title Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration
title_full Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration
title_fullStr Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration
title_full_unstemmed Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration
title_short Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration
title_sort chemical entity semantic specification: knowledge representation for efficient semantic cheminformatics and facile data integration
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3121712/
https://www.ncbi.nlm.nih.gov/pubmed/21595881
http://dx.doi.org/10.1186/1758-2946-3-20
work_keys_str_mv AT chepelevleonidl chemicalentitysemanticspecificationknowledgerepresentationforefficientsemanticcheminformaticsandfaciledataintegration
AT dumontiermichel chemicalentitysemanticspecificationknowledgerepresentationforefficientsemanticcheminformaticsandfaciledataintegration