Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies

Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models curren...

Bibliographic Details
Main authors: Jorner, Kjell; Brinck, Tore; Norrby, Per-Ola; Buttar, David
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2020
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9528810/
https://www.ncbi.nlm.nih.gov/pubmed/36299676
http://dx.doi.org/10.1039/d0sc04896h
_version_ 1784801367269834752
author Jorner, Kjell
Brinck, Tore
Norrby, Per-Ola
Buttar, David
author_facet Jorner, Kjell
Brinck, Tore
Norrby, Per-Ola
Buttar, David
author_sort Jorner, Kjell
collection PubMed
description Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, influence of solvents and catalysts and extent of regio- and chemoselectivity. Here, we construct hybrid models which combine the traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100–150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
format Online
Article
Text
id pubmed-9528810
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-95288102022-10-25 Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies Jorner, Kjell Brinck, Tore Norrby, Per-Ola Buttar, David Chem Sci Chemistry Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, influence of solvents and catalysts and extent of regio- and chemoselectivity. Here, we construct hybrid models which combine the traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100–150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
The Royal Society of Chemistry 2020-11-05 /pmc/articles/PMC9528810/ /pubmed/36299676 http://dx.doi.org/10.1039/d0sc04896h Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by/3.0/
spellingShingle Chemistry
Jorner, Kjell
Brinck, Tore
Norrby, Per-Ola
Buttar, David
Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies
title Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies
title_full Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies
title_fullStr Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies
title_full_unstemmed Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies
title_short Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies
title_sort machine learning meets mechanistic modelling for accurate prediction of experimental activation energies
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9528810/
https://www.ncbi.nlm.nih.gov/pubmed/36299676
http://dx.doi.org/10.1039/d0sc04896h
work_keys_str_mv AT jornerkjell machinelearningmeetsmechanisticmodellingforaccuratepredictionofexperimentalactivationenergies
AT brincktore machinelearningmeetsmechanisticmodellingforaccuratepredictionofexperimentalactivationenergies
AT norrbyperola machinelearningmeetsmechanisticmodellingforaccuratepredictionofexperimentalactivationenergies
AT buttardavid machinelearningmeetsmechanisticmodellingforaccuratepredictionofexperimentalactivationenergies