Using Expert Driven Machine Learning to Enhance Dynamic Metabolomics Data Analysis

Data analysis for metabolomics is undergoing rapid progress thanks to the proliferation of novel tools and the standardization of existing workflows. As untargeted metabolomics datasets and experiments continue to increase in size and complexity, standardized workflows are often not sufficiently sop...

Bibliographic Details
Main authors: Beirnaert, Charlie; Peeters, Laura; Meysman, Pieter; Bittremieux, Wout; Foubert, Kenn; Custers, Deborah; Van der Auwera, Anastasia; Cuykx, Matthias; Pieters, Luc; Covaci, Adrian; Laukens, Kris
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6468718/
https://www.ncbi.nlm.nih.gov/pubmed/30897797
http://dx.doi.org/10.3390/metabo9030054
_version_ 1783411498251452416
author Beirnaert, Charlie
Peeters, Laura
Meysman, Pieter
Bittremieux, Wout
Foubert, Kenn
Custers, Deborah
Van der Auwera, Anastasia
Cuykx, Matthias
Pieters, Luc
Covaci, Adrian
Laukens, Kris
author_sort Beirnaert, Charlie
collection PubMed
description Data analysis for metabolomics is undergoing rapid progress thanks to the proliferation of novel tools and the standardization of existing workflows. As untargeted metabolomics datasets and experiments continue to increase in size and complexity, standardized workflows are often not sufficiently sophisticated. In addition, the ground truth for untargeted metabolomics experiments is intrinsically unknown and the performance of tools is difficult to evaluate. Here, the problem of dynamic multi-class metabolomics experiments was investigated using a simulated dataset with a known ground truth. This simulated dataset was used to evaluate the performance of tinderesting, a new and intuitive tool based on gathering expert knowledge to be used in machine learning. The results were compared to EDGE, a statistical method for time series data. This paper presents three novel outcomes. The first is a way to simulate dynamic metabolomics data with a known ground truth based on ordinary differential equations. This method is made available through the MetaboLouise R package. Second, the EDGE tool, originally developed for genomics data analysis, is highly performant in analyzing dynamic case vs. control metabolomics data. Third, the tinderesting method is introduced to analyse more complex dynamic metabolomics experiments. This tool consists of a Shiny app for collecting expert knowledge, which in turn is used to train a machine learning model to emulate the decision process of the expert. This approach does not replace traditional data analysis workflows for metabolomics, but can provide additional information, improved performance or easier interpretation of results. The advantage is that the tool is agnostic to the complexity of the experiment, and thus is easier to use in advanced setups. All code for the presented analysis, MetaboLouise and tinderesting are freely available.
format Online
Article
Text
id pubmed-6468718
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6468718 2019-04-22 Using Expert Driven Machine Learning to Enhance Dynamic Metabolomics Data Analysis Metabolites Article MDPI 2019-03-20 /pmc/articles/PMC6468718/ /pubmed/30897797 http://dx.doi.org/10.3390/metabo9030054 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Using Expert Driven Machine Learning to Enhance Dynamic Metabolomics Data Analysis
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6468718/
https://www.ncbi.nlm.nih.gov/pubmed/30897797
http://dx.doi.org/10.3390/metabo9030054