
Multi-View Variational Autoencoder for Missing Value Imputation in Untargeted Metabolomics

BACKGROUND: Missing data is a common challenge in mass spectrometry-based metabolomics, which can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in met...


Bibliographic Details
Main Authors: Zhao, Chen, Su, Kuan-Jui, Wu, Chong, Cao, Xuewei, Sha, Qiuying, Li, Wu, Luo, Zhe, Qin, Tian, Qiu, Chuan, Zhao, Lan Juan, Liu, Anqi, Jiang, Lindong, Zhang, Xiao, Shen, Hui, Zhou, Weihua, Deng, Hong-Wen
Format: Online Article Text
Language: English
Published: Cornell University 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10593076/
https://www.ncbi.nlm.nih.gov/pubmed/37873011
_version_ 1785124389568643072
author Zhao, Chen
Su, Kuan-Jui
Wu, Chong
Cao, Xuewei
Sha, Qiuying
Li, Wu
Luo, Zhe
Qin, Tian
Qiu, Chuan
Zhao, Lan Juan
Liu, Anqi
Jiang, Lindong
Zhang, Xiao
Shen, Hui
Zhou, Weihua
Deng, Hong-Wen
author_sort Zhao, Chen
collection PubMed
description BACKGROUND: Missing data is a common challenge in mass spectrometry-based metabolomics, which can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in metabolomics studies. METHOD: In this study, we propose a novel method that leverages the information from WGS data and reference metabolites to impute unknown metabolites. Our approach utilizes a multi-view variational autoencoder to jointly model the burden score, polygenic risk score (PGS), and linkage disequilibrium (LD) pruned single nucleotide polymorphisms (SNPs) for feature extraction and missing metabolomics data imputation. By learning latent representations of both omics data types, our method can effectively impute missing metabolomics values based on genomic information. RESULTS: We evaluate the performance of our method on empirical metabolomics datasets with missing values and demonstrate its superiority compared to conventional imputation techniques. Using 35 template metabolite-derived burden scores, PGS, and LD-pruned SNPs, the proposed method achieved [Formula: see text]-scores > 0.01 for 71.55% of metabolites. CONCLUSION: The integration of WGS data in metabolomics imputation not only improves data completeness but also enhances downstream analyses, paving the way for more comprehensive and accurate investigations of metabolic pathways and disease associations. Our findings offer valuable insights into the potential benefits of utilizing WGS data for metabolomics data imputation and underscore the importance of leveraging multi-modal data integration in precision medicine research.
format Online
Article
Text
id pubmed-10593076
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cornell University
record_format MEDLINE/PubMed
spelling pubmed-10593076 2023-10-24 Multi-View Variational Autoencoder for Missing Value Imputation in Untargeted Metabolomics ArXiv Article Cornell University 2023-10-12 /pmc/articles/PMC10593076/ /pubmed/37873011 Text en This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Multi-View Variational Autoencoder for Missing Value Imputation in Untargeted Metabolomics
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10593076/
https://www.ncbi.nlm.nih.gov/pubmed/37873011