
Improving MetFrag with statistical learning of fragment annotations

BACKGROUND: Molecule identification is a crucial step in metabolomics and environmental sciences. Alongside in silico fragmentation, as performed by MetFrag, machine learning and statistical methods have emerged and have improved molecule annotation based on MS/MS data. In this work we presen...


Bibliographic Details
Main authors: Ruttkies, Christoph, Neumann, Steffen, Posch, Stefan
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612146/
https://www.ncbi.nlm.nih.gov/pubmed/31277571
http://dx.doi.org/10.1186/s12859-019-2954-7
author Ruttkies, Christoph
Neumann, Steffen
Posch, Stefan
collection PubMed
description BACKGROUND: Molecule identification is a crucial step in metabolomics and environmental sciences. Alongside in silico fragmentation, as performed by MetFrag, machine learning and statistical methods have emerged and have improved molecule annotation based on MS/MS data. In this work we present a new statistical scoring method in which annotations of m/z fragment peaks to fragment structures are learned in a training step. Based on a Bayesian model, two additional scoring terms are integrated into the new MetFrag2.4.5 and evaluated on the test data set of the CASMI 2016 contest. RESULTS: The results on the 87 MS/MS spectra from positive and negative mode show a substantial improvement over the submissions made with the former MetFrag approach. Top1 rankings increased from 5 to 21 and Top10 rankings from 39 to 55, both higher than those of CSI:IOKR, the winner of the CASMI 2016 contest. For the negative mode spectra, MetFrag's statistical scoring outperforms all other participants that submitted results for this type of spectra. CONCLUSIONS: This study shows how statistical learning can improve molecular structure identification based on MS/MS data compared with the same method using combinatorial in silico fragmentation only. MetFrag2.4.5 performs better than the other participating approaches, especially in negative mode. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2954-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6612146
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6612146 2019-07-16 Improving MetFrag with statistical learning of fragment annotations Ruttkies, Christoph Neumann, Steffen Posch, Stefan BMC Bioinformatics Research Article
BioMed Central 2019-07-05 /pmc/articles/PMC6612146/ /pubmed/31277571 http://dx.doi.org/10.1186/s12859-019-2954-7 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Improving MetFrag with statistical learning of fragment annotations
topic Research Article