Building blocks for automated elucidation of metabolites: Machine learning methods for NMR prediction

BACKGROUND: Current efforts in Metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identification of substances and measurements of their concentration. Still, only a fraction of existing met...

Bibliographic Details
Main authors: Kuhn, Stefan; Egert, Björn; Neumann, Steffen; Steinbeck, Christoph
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2605476/
https://www.ncbi.nlm.nih.gov/pubmed/18817546
http://dx.doi.org/10.1186/1471-2105-9-400
author Kuhn, Stefan
Egert, Björn
Neumann, Steffen
Steinbeck, Christoph
collection PubMed
description BACKGROUND: Current efforts in Metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identification of substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to address this lack of knowledge. Indispensable for CASE are modules to predict spectra for hypothetical structures. This paper evaluates different statistical and machine learning methods to perform predictions of proton NMR spectra based on data from our open database NMRShiftDB. RESULTS: A mean absolute error of 0.18 ppm was achieved for the prediction of proton NMR shifts ranging from 0 to 11 ppm. Random forest, J48 decision tree and support vector machines achieved similar overall errors. HOSE codes, being a notably simple method, achieved a comparatively good result of 0.17 ppm mean absolute error. CONCLUSION: NMR prediction methods applied in the course of this work delivered precise predictions which can serve as a building block for Computer-Assisted Structure Elucidation for biological metabolites.
format Text
id pubmed-2605476
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2605476 2008-12-19 Building blocks for automated elucidation of metabolites: Machine learning methods for NMR prediction Kuhn, Stefan Egert, Björn Neumann, Steffen Steinbeck, Christoph BMC Bioinformatics Methodology Article BACKGROUND: Current efforts in Metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identification of substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to address this lack of knowledge. Indispensable for CASE are modules to predict spectra for hypothetical structures. This paper evaluates different statistical and machine learning methods to perform predictions of proton NMR spectra based on data from our open database NMRShiftDB. RESULTS: A mean absolute error of 0.18 ppm was achieved for the prediction of proton NMR shifts ranging from 0 to 11 ppm. Random forest, J48 decision tree and support vector machines achieved similar overall errors. HOSE codes, being a notably simple method, achieved a comparatively good result of 0.17 ppm mean absolute error. CONCLUSION: NMR prediction methods applied in the course of this work delivered precise predictions which can serve as a building block for Computer-Assisted Structure Elucidation for biological metabolites. BioMed Central 2008-09-25 /pmc/articles/PMC2605476/ /pubmed/18817546 http://dx.doi.org/10.1186/1471-2105-9-400 Text en Copyright © 2008 Kuhn et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Building blocks for automated elucidation of metabolites: Machine learning methods for NMR prediction
topic Methodology Article