
Real-time prediction of ¹H and ¹³C chemical shifts with DFT accuracy using a 3D graph neural network

Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by b...

Bibliographic Details
Main Authors: Guan, Yanfei; Shree Sowndarya, S. V.; Gallegos, Liliana C.; St. John, Peter C.; Paton, Robert S.
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2021
Subjects: Chemistry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8457395/
https://www.ncbi.nlm.nih.gov/pubmed/34667567
http://dx.doi.org/10.1039/d1sc03343c
author Guan, Yanfei
Shree Sowndarya, S. V.
Gallegos, Liliana C.
St. John, Peter C.
Paton, Robert S.
collection PubMed
description Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by both through-bond and through-space interactions with other atoms and functional groups. The in silico prediction of NMR chemical shifts using quantum mechanical (QM) calculations is now commonplace in aiding organic structural assignment since spectra can be computed for several candidate structures and then compared with experimental values to find the best possible match. However, the computational demands of calculating multiple structural- and stereo-isomers, each of which may typically exist as an ensemble of rapidly-interconverting conformations, are expensive. Additionally, the QM predictions themselves may lack sufficient accuracy to identify a correct structure. In this work, we address both of these shortcomings by developing a rapid machine learning (ML) protocol to predict ¹H and ¹³C chemical shifts through an efficient graph neural network (GNN) using 3D structures as input. Transfer learning with experimental data is used to improve the final prediction accuracy of a model trained using QM calculations. When tested on the CHESHIRE dataset, the proposed model predicts observed ¹³C chemical shifts with comparable accuracy to the best-performing DFT functionals (1.5 ppm) in around 1/6000 of the CPU time. An automated prediction webserver and graphical interface are accessible online at http://nova.chem.colostate.edu/cascade/.
We further demonstrate the model in three applications: first, we use the model to decide the correct organic structure from candidates through experimental spectra, including complex stereoisomers; second, we automatically detect and revise incorrect chemical shift assignments in a popular NMR database, the NMRShiftDB; and third, we use NMR chemical shifts as descriptors for determination of the sites of electrophilic aromatic substitution.
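The abstract describes structural assignment as comparing computed shifts for several candidate structures against experimental values to find the best match. A minimal sketch of that comparison step follows, assuming the predicted and observed shifts are already paired atom by atom and scored by mean absolute error. The function names and the numeric values are hypothetical illustrations, not the authors' code or data.

```python
from statistics import mean


def mean_absolute_error(predicted, observed):
    """Mean absolute deviation (ppm) between paired shift lists."""
    if len(predicted) != len(observed):
        raise ValueError("shift lists must have the same length")
    return mean(abs(p - o) for p, o in zip(predicted, observed))


def rank_candidates(candidates, observed):
    """Return (name, MAE) pairs sorted from best to worst match."""
    scores = {
        name: mean_absolute_error(shifts, observed)
        for name, shifts in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up 13C shifts (ppm) for illustration only.
observed = [128.5, 77.2, 21.0]
candidates = {
    "candidate_A": [128.0, 77.0, 21.5],
    "candidate_B": [135.0, 70.0, 25.0],
}

ranking = rank_candidates(candidates, observed)
best_name, best_mae = ranking[0]
print(f"best match: {best_name} (MAE {best_mae:.2f} ppm)")
```

In practice the predicted lists would come from a shift predictor (QM or ML) rather than being typed in, and other scoring schemes could replace the mean absolute error.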
format Online
Article
Text
id pubmed-8457395
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-8457395 2021-10-18 Chem Sci Chemistry The Royal Society of Chemistry 2021-08-09 /pmc/articles/PMC8457395/ /pubmed/34667567 http://dx.doi.org/10.1039/d1sc03343c Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by/3.0/
title Real-time prediction of ¹H and ¹³C chemical shifts with DFT accuracy using a 3D graph neural network
topic Chemistry