Self-organizing ontology of biochemically relevant small molecules

Bibliographic Details
Main Authors: Chepelev, Leonid L, Hastings, Janna, Ennis, Marcus, Steinbeck, Christoph, Dumontier, Michel
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3267649/
https://www.ncbi.nlm.nih.gov/pubmed/22221313
http://dx.doi.org/10.1186/1471-2105-13-3
author Chepelev, Leonid L
Hastings, Janna
Ennis, Marcus
Steinbeck, Christoph
Dumontier, Michel
collection PubMed
description BACKGROUND: The advent of high-throughput experimentation in biochemistry has led to the generation of vast amounts of chemical data, necessitating the development of novel analysis, characterization, and cataloguing techniques and tools. Recently, a movement to publicly release such data has advanced biochemical structure-activity relationship research while presenting new challenges, the biggest being the curation, annotation, and classification of this information to facilitate useful biochemical pattern analysis. Unfortunately, the human resources currently employed by the organizations supporting these efforts (e.g. ChEBI) are expanding linearly, while new useful scientific information is being released in a seemingly exponential fashion. Compounding this, existing chemical classification and annotation systems are not amenable to automated classification, formal and transparent axiomatization of chemical class definitions, facile class redefinition, or novel class integration, which further limits chemical ontology growth by necessitating human involvement in curation. Clearly, there is a need to automate this process, especially for novel chemical entities of biological interest. RESULTS: To address this, we present a formal framework based on Semantic Web technologies for the automatic design of a chemical ontology that can be used for automated classification of novel entities. We demonstrate the automatic self-assembly of a structure-based chemical ontology based on 60 MeSH and 40 ChEBI chemical classes. This ontology is then used to classify 200 compounds with an accuracy of 92.7%. We extend these structure-based classes with molecular feature information and demonstrate the utility of our framework for classification of functionally relevant chemicals. Finally, we discuss an iterative approach that we envision for future biochemical ontology development.
CONCLUSIONS: We conclude that the proposed methodology can ease the burden of chemical data annotators and dramatically increase their productivity. We anticipate that the use of formal logic in our proposed framework will make chemical classification criteria more transparent to humans and machines alike and will thus facilitate predictive and integrative bioactivity model development.
format Online
Article
Text
id pubmed-3267649
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3267649 2012-01-28 Self-organizing ontology of biochemically relevant small molecules Chepelev, Leonid L Hastings, Janna Ennis, Marcus Steinbeck, Christoph Dumontier, Michel BMC Bioinformatics Research Article BioMed Central 2012-01-06 /pmc/articles/PMC3267649/ /pubmed/22221313 http://dx.doi.org/10.1186/1471-2105-13-3 Text en Copyright ©2012 Chepelev et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Self-organizing ontology of biochemically relevant small molecules
topic Research Article