Cargando…
Denoising Autoencoder Normalization for Large-Scale Untargeted Metabolomics by Gas Chromatography–Mass Spectrometry
Large-scale metabolomics assays are widely used in epidemiology for biomarker discovery and risk assessments. However, systematic errors introduced by instrumental signal drifting pose a big challenge in large-scale assays, especially for derivatization-based gas chromatography–mass spectrometry (GC...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10456436/ https://www.ncbi.nlm.nih.gov/pubmed/37623887 http://dx.doi.org/10.3390/metabo13080944 |
_version_ | 1785096698363641856 |
---|---|
author | Zhang, Ying Fan, Sili Wohlgemuth, Gert Fiehn, Oliver |
author_facet | Zhang, Ying Fan, Sili Wohlgemuth, Gert Fiehn, Oliver |
author_sort | Zhang, Ying |
collection | PubMed |
description | Large-scale metabolomics assays are widely used in epidemiology for biomarker discovery and risk assessments. However, systematic errors introduced by instrumental signal drifting pose a big challenge in large-scale assays, especially for derivatization-based gas chromatography–mass spectrometry (GC–MS). Here, we compare the results of different normalization methods for a study with more than 4000 human plasma samples involved in a type 2 diabetes cohort study, in addition to 413 pooled quality control (QC) samples, 413 commercial pooled plasma samples, and a set of 25 stable isotope-labeled internal standards used for every sample. Data acquisition was conducted across 1.2 years, including seven column changes. In total, 413 pooled QC (training) and 413 BioIVT samples (validation) were used for normalization comparisons. Surprisingly, neither internal standards nor sum-based normalizations yielded median precision of less than 30% across all 563 metabolite annotations. While the machine-learning-based SERRF algorithm gave 19% median precision based on the pooled quality control samples, external cross-validation with BioIVT plasma pools yielded a median 34% relative standard deviation (RSD). We developed a new method: systematic error reduction by denoising autoencoder (SERDA). SERDA lowered the median standard deviations of the training QC samples down to 16% RSD, yielding an overall error of 19% RSD when applied to the independent BioIVT validation QC samples. This is the largest study on GC–MS metabolomics ever reported, demonstrating that technical errors can be normalized and handled effectively for this assay. SERDA was further validated on two additional large-scale GC–MS-based human plasma metabolomics studies, confirming the superior performance of SERDA over SERRF or sum normalizations. |
format | Online Article Text |
id | pubmed-10456436 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-104564362023-08-26 Denoising Autoencoder Normalization for Large-Scale Untargeted Metabolomics by Gas Chromatography–Mass Spectrometry Zhang, Ying Fan, Sili Wohlgemuth, Gert Fiehn, Oliver Metabolites Article Large-scale metabolomics assays are widely used in epidemiology for biomarker discovery and risk assessments. However, systematic errors introduced by instrumental signal drifting pose a big challenge in large-scale assays, especially for derivatization-based gas chromatography–mass spectrometry (GC–MS). Here, we compare the results of different normalization methods for a study with more than 4000 human plasma samples involved in a type 2 diabetes cohort study, in addition to 413 pooled quality control (QC) samples, 413 commercial pooled plasma samples, and a set of 25 stable isotope-labeled internal standards used for every sample. Data acquisition was conducted across 1.2 years, including seven column changes. In total, 413 pooled QC (training) and 413 BioIVT samples (validation) were used for normalization comparisons. Surprisingly, neither internal standards nor sum-based normalizations yielded median precision of less than 30% across all 563 metabolite annotations. While the machine-learning-based SERRF algorithm gave 19% median precision based on the pooled quality control samples, external cross-validation with BioIVT plasma pools yielded a median 34% relative standard deviation (RSD). We developed a new method: systematic error reduction by denoising autoencoder (SERDA). SERDA lowered the median standard deviations of the training QC samples down to 16% RSD, yielding an overall error of 19% RSD when applied to the independent BioIVT validation QC samples. This is the largest study on GC–MS metabolomics ever reported, demonstrating that technical errors can be normalized and handled effectively for this assay. SERDA was further validated on two additional large-scale GC–MS-based human plasma metabolomics studies, confirming the superior performance of SERDA over SERRF or sum normalizations. MDPI 2023-08-13 /pmc/articles/PMC10456436/ /pubmed/37623887 http://dx.doi.org/10.3390/metabo13080944 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhang, Ying Fan, Sili Wohlgemuth, Gert Fiehn, Oliver Denoising Autoencoder Normalization for Large-Scale Untargeted Metabolomics by Gas Chromatography–Mass Spectrometry |
title | Denoising Autoencoder Normalization for Large-Scale Untargeted Metabolomics by Gas Chromatography–Mass Spectrometry |
title_full | Denoising Autoencoder Normalization for Large-Scale Untargeted Metabolomics by Gas Chromatography–Mass Spectrometry |
title_fullStr | Denoising Autoencoder Normalization for Large-Scale Untargeted Metabolomics by Gas Chromatography–Mass Spectrometry |
title_full_unstemmed | Denoising Autoencoder Normalization for Large-Scale Untargeted Metabolomics by Gas Chromatography–Mass Spectrometry |
title_short | Denoising Autoencoder Normalization for Large-Scale Untargeted Metabolomics by Gas Chromatography–Mass Spectrometry |
title_sort | denoising autoencoder normalization for large-scale untargeted metabolomics by gas chromatography–mass spectrometry |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10456436/ https://www.ncbi.nlm.nih.gov/pubmed/37623887 http://dx.doi.org/10.3390/metabo13080944 |
work_keys_str_mv | AT zhangying denoisingautoencodernormalizationforlargescaleuntargetedmetabolomicsbygaschromatographymassspectrometry AT fansili denoisingautoencodernormalizationforlargescaleuntargetedmetabolomicsbygaschromatographymassspectrometry AT wohlgemuthgert denoisingautoencodernormalizationforlargescaleuntargetedmetabolomicsbygaschromatographymassspectrometry AT fiehnoliver denoisingautoencodernormalizationforlargescaleuntargetedmetabolomicsbygaschromatographymassspectrometry |