Opening the Random Forest Black Box of ¹H NMR Metabolomics Data by the Exploitation of Surrogate Variables
The untargeted metabolomics analysis of biological samples with nuclear magnetic resonance (NMR) provides highly complex data containing various signals from different molecules. To use these data for classification, e.g., in the context of food authentication, machine learning methods are used. The...
Main Authors: | Wenck, Soeren; Mix, Thorsten; Fischer, Markus; Hackl, Thomas; Seifert, Stephan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10608983/ https://www.ncbi.nlm.nih.gov/pubmed/37887402 http://dx.doi.org/10.3390/metabo13101075 |
_version_ | 1785127906619424768 |
---|---|
author | Wenck, Soeren Mix, Thorsten Fischer, Markus Hackl, Thomas Seifert, Stephan |
author_facet | Wenck, Soeren Mix, Thorsten Fischer, Markus Hackl, Thomas Seifert, Stephan |
author_sort | Wenck, Soeren |
collection | PubMed |
description | The untargeted metabolomics analysis of biological samples with nuclear magnetic resonance (NMR) provides highly complex data containing various signals from different molecules. To use these data for classification, e.g., in the context of food authentication, machine learning methods are used. These methods are usually applied as a black box, which means that no information about the complex relationships between the variables and the outcome is obtained. In this study, we show that the random forest-based approach surrogate minimal depth (SMD) can be applied for a comprehensive analysis of class-specific differences by selecting relevant variables and analyzing their mutual impact on the classification model of different truffle species. SMD allows the assignment of variables from the same metabolites as well as the detection of interactions between different metabolites that can be attributed to known biological relationships. |
format | Online Article Text |
id | pubmed-10608983 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-106089832023-10-28 Opening the Random Forest Black Box of (1)H NMR Metabolomics Data by the Exploitation of Surrogate Variables Wenck, Soeren Mix, Thorsten Fischer, Markus Hackl, Thomas Seifert, Stephan Metabolites Article The untargeted metabolomics analysis of biological samples with nuclear magnetic resonance (NMR) provides highly complex data containing various signals from different molecules. To use these data for classification, e.g., in the context of food authentication, machine learning methods are used. These methods are usually applied as a black box, which means that no information about the complex relationships between the variables and the outcome is obtained. In this study, we show that the random forest-based approach surrogate minimal depth (SMD) can be applied for a comprehensive analysis of class-specific differences by selecting relevant variables and analyzing their mutual impact on the classification model of different truffle species. SMD allows the assignment of variables from the same metabolites as well as the detection of interactions between different metabolites that can be attributed to known biological relationships. MDPI 2023-10-13 /pmc/articles/PMC10608983/ /pubmed/37887402 http://dx.doi.org/10.3390/metabo13101075 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Opening the Random Forest Black Box of (1)H NMR Metabolomics Data by the Exploitation of Surrogate Variables |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10608983/ https://www.ncbi.nlm.nih.gov/pubmed/37887402 http://dx.doi.org/10.3390/metabo13101075 |