Convolutional Neural Network-Based Compound Fingerprint Prediction for Metabolite Annotation

Bibliographic Details
Main authors: Gao, Shijinqiu; Chau, Hoi Yan Katharine; Wang, Kuijun; Ao, Hongyu; Varghese, Rency S.; Ressom, Habtom W.
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9316655/
https://www.ncbi.nlm.nih.gov/pubmed/35888729
http://dx.doi.org/10.3390/metabo12070605
author Gao, Shijinqiu
Chau, Hoi Yan Katharine
Wang, Kuijun
Ao, Hongyu
Varghese, Rency S.
Ressom, Habtom W.
collection PubMed
description Metabolite annotation remains a challenge, especially in untargeted metabolomics studies using liquid chromatography coupled with mass spectrometry (LC-MS). This is partly due to the limitations of publicly available spectral libraries, which contain tandem mass spectrometry (MS/MS) data acquired from only a fraction of known metabolites. Machine learning makes it possible to predict molecular fingerprints from MS/MS data. The predicted molecular fingerprints can then be used to help rank putative metabolite IDs obtained using either the precursor mass or the formula of the unknown metabolite. This approach is particularly useful for annotating metabolites whose MS/MS spectra are missing from, or cannot be matched against, accessible spectral libraries. We investigated a convolutional neural network (CNN) for molecular fingerprint prediction from MS/MS data. We used more than 680,000 MS/MS spectra from the MoNA repository and NIST 20, representing about 36,000 compounds, to train and test our CNN model. The trained CNN model is implemented as a Python package, MetFID. The package is available on GitHub; users enter their MS/MS spectra and corresponding putative metabolite IDs to obtain ranked lists of metabolites. On the CASMI 2016 benchmark dataset, MetFID ranks putative metabolite IDs better than two other machine learning-based tools (CSI:FingerID and ChemDistiller).
format Online
Article
Text
id pubmed-9316655
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9316655 2022-07-27 Metabolites Article MDPI 2022-06-29 /pmc/articles/PMC9316655/ /pubmed/35888729 http://dx.doi.org/10.3390/metabo12070605 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Convolutional Neural Network-Based Compound Fingerprint Prediction for Metabolite Annotation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9316655/
https://www.ncbi.nlm.nih.gov/pubmed/35888729
http://dx.doi.org/10.3390/metabo12070605