Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis

Rational solvent selection remains a significant challenge in process development. Here we describe a hybrid mechanistic-machine learning approach, geared towards automated process development workflow. A library of 459 solvents was used, for which 12 conventional molecular descriptors, two reaction...

Bibliographic Details
Main authors: Amar, Yehia; Schweidtmann, Artur M.; Deutsch, Paul; Cao, Liwei; Lapkin, Alexei
Format: Online Article Text
Language: English
Published: Royal Society of Chemistry 2019
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6625492/
https://www.ncbi.nlm.nih.gov/pubmed/31367324
http://dx.doi.org/10.1039/c9sc01844a
author Amar, Yehia
Schweidtmann, Artur M.
Deutsch, Paul
Cao, Liwei
Lapkin, Alexei
collection PubMed
description Rational solvent selection remains a significant challenge in process development. Here we describe a hybrid mechanistic-machine learning approach, geared towards automated process development workflow. A library of 459 solvents was used, for which 12 conventional molecular descriptors, two reaction-specific descriptors, and additional descriptors based on screening charge density, were calculated. Gaussian process surrogate models were trained on experimental data from a Rh(CO)2(acac)/Josiphos catalysed asymmetric hydrogenation of a chiral α,β-unsaturated γ-lactam. With two simultaneous objectives – high conversion and high diastereomeric excess – the multi-objective algorithm, trained on the initial dataset of 25 solvents, has identified solvents leading to better reaction outcomes. In addition to being a powerful design of experiments (DoE) methodology, the resulting Gaussian process surrogate model for conversion is, in statistical terms, predictive, with a cross-validation correlation coefficient of 0.84. After identifying promising solvents, the composition of solvent mixtures and optimal reaction temperature were found using a black-box Bayesian optimisation. We then demonstrated the application of a new genetic programming approach to select an appropriate machine learning model for a specific physical system, which should allow the transition of the overall process development workflow into the future robotic laboratories.
format Online
Article
Text
id pubmed-6625492
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-6625492 2019-07-31 Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis Amar, Yehia Schweidtmann, Artur M. Deutsch, Paul Cao, Liwei Lapkin, Alexei Chem Sci Chemistry Royal Society of Chemistry 2019-05-30 /pmc/articles/PMC6625492/ /pubmed/31367324 http://dx.doi.org/10.1039/c9sc01844a Text en This journal is © The Royal Society of Chemistry 2019 http://creativecommons.org/licenses/by/3.0/ This article is freely available.
This article is licensed under a Creative Commons Attribution 3.0 Unported Licence (CC BY 3.0)
title Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis
topic Chemistry