Deep Learning-Based Method for Compound Identification in NMR Spectra of Mixtures

Bibliographic Details
Main Authors: Wei, Weiwei; Liao, Yuxuan; Wang, Yufei; Wang, Shaoqi; Du, Wen; Lu, Hongmei; Kong, Bo; Yang, Huawu; Zhang, Zhimin
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9227391/
https://www.ncbi.nlm.nih.gov/pubmed/35744782
http://dx.doi.org/10.3390/molecules27123653
author Wei, Weiwei
Liao, Yuxuan
Wang, Yufei
Wang, Shaoqi
Du, Wen
Lu, Hongmei
Kong, Bo
Yang, Huawu
Zhang, Zhimin
collection PubMed
description Nuclear magnetic resonance (NMR) spectroscopy is highly unbiased and reproducible, which makes it a powerful tool for analyzing mixtures of small molecules. However, identifying compounds in NMR spectra of mixtures is highly challenging because the chemical shifts of the same compound vary between mixtures and the peaks of different molecules overlap. Here, we present a pseudo-Siamese convolutional neural network method (pSCNN) to identify compounds in mixtures for NMR spectroscopy. A data augmentation method was implemented to superpose several NMR spectra sampled from a spectral database, with random noise. The augmented dataset was split and used to train, validate and test the pSCNN model. Two experimental NMR datasets (flavor mixtures and additional flavor mixture) were acquired to benchmark its performance in real applications. The results show that the proposed method achieves good performance on the augmented test set (ACC = 99.80%, TPR = 99.70% and FPR = 0.10%), the flavor mixtures dataset (ACC = 97.62%, TPR = 96.44% and FPR = 2.29%) and the additional flavor mixture dataset (ACC = 91.67%, TPR = 100.00% and FPR = 10.53%). We have demonstrated that the translational invariance of convolutional neural networks can solve the chemical shift variation problem in NMR spectra. In summary, pSCNN is an off-the-shelf method to identify compounds in mixtures for NMR spectroscopy because of its accuracy in compound identification and robustness to chemical shift variation.
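The description reports accuracy (ACC), true positive rate (TPR) and false positive rate (FPR) without defining them. The sketch below uses the standard confusion-matrix definitions, which is an assumption about how the authors computed them. The function name `rates` and the counts passed to it are hypothetical: the record gives no counts, and these were chosen only because they happen to reproduce the percentages reported for the additional flavor mixture dataset.

```python
def rates(tp, fn, fp, tn):
    """Return (ACC, TPR, FPR) in percent from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative,
    so neither denominator is zero.
    """
    total = tp + fn + fp + tn
    acc = 100.0 * (tp + tn) / total   # share of all decisions that are correct
    tpr = 100.0 * tp / (tp + fn)      # share of present compounds that are found
    fpr = 100.0 * fp / (fp + tn)      # share of absent compounds wrongly reported
    return acc, tpr, fpr


# Hypothetical counts (NOT from the record): 5 compounds present and all found,
# 19 candidates absent of which 2 are wrongly reported.
acc, tpr, fpr = rates(tp=5, fn=0, fp=2, tn=17)
print(f"ACC = {acc:.2f}%, TPR = {tpr:.2f}%, FPR = {fpr:.2f}%")
```

With these illustrative counts, two false positives among 19 absent candidates are enough to lower ACC to about 92% even though every present compound is detected. That is one way a TPR of 100% can coexist with an FPR above 10% on a small test set.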
format Online
Article
Text
id pubmed-9227391
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9227391 2022-06-25
Molecules
Article
MDPI 2022-06-07
Text en
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Deep Learning-Based Method for Compound Identification in NMR Spectra of Mixtures
topic Article