Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction

Pd-catalyzed C–N couplings are commonplace in academia and industry. Despite their significance, finding suitable reaction conditions leading to a high yield, for instance, remains a challenging and time-consuming task which usually requires screening over many sets of conditions....

Bibliographic Details
Main authors:	Fitzner, Martin; Wuitschik, Georg; Koller, Raffael; Adam, Jean-Michel; Schindler, Torsten
Format:	Online Article Text
Language:	English
Published:	American Chemical Society 2023
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9878668/ https://www.ncbi.nlm.nih.gov/pubmed/36713686 http://dx.doi.org/10.1021/acsomega.2c05546

_version_	1784878536773861376
author	Fitzner, Martin Wuitschik, Georg Koller, Raffael Adam, Jean-Michel Schindler, Torsten
author_facet	Fitzner, Martin Wuitschik, Georg Koller, Raffael Adam, Jean-Michel Schindler, Torsten
author_sort	Fitzner, Martin
collection	PubMed
description	Pd-catalyzed C–N couplings are commonplace in academia and industry. Despite their significance, finding suitable reaction conditions leading to a high yield, for instance, remains a challenging and time-consuming task which usually requires screening over many sets of conditions. To help select promising reaction conditions in the vast space of reagent combinations, machine learning is an emerging technique with a lot of promise. In this work, we assess whether the reaction yield of C–N couplings can be predicted from databases of chemical reactions. We test the generalizability of models both on challenging data splits and on a dedicated experimental test set. We find that, provided the chemical space represented by the training set is not left, the models perform well. However, the applicability domain is quickly left even for simple reactions of the same type, as, for instance, present in our plate test set. The results show that yield prediction for new reactions is possible from the algorithmic side but in practice is hindered by the available data. Most importantly, more data that cover the diversity in reagents are needed for a general-purpose prediction of reaction yields. Our findings also expose a challenge to this field in that it appears to be extremely deceiving to judge models based on literature data with test sets which are split off the same literature data, even when challenging splits are considered.
format	Online Article Text
id	pubmed-9878668
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	American Chemical Society
record_format	MEDLINE/PubMed
spelling	pubmed-9878668 2023-01-27 Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction Fitzner, Martin Wuitschik, Georg Koller, Raffael Adam, Jean-Michel Schindler, Torsten ACS Omega Pd-catalyzed C–N couplings are commonplace in academia and industry. Despite their significance, finding suitable reaction conditions leading to a high yield, for instance, remains a challenging and time-consuming task which usually requires screening over many sets of conditions. To help select promising reaction conditions in the vast space of reagent combinations, machine learning is an emerging technique with a lot of promise. In this work, we assess whether the reaction yield of C–N couplings can be predicted from databases of chemical reactions. We test the generalizability of models both on challenging data splits and on a dedicated experimental test set. We find that, provided the chemical space represented by the training set is not left, the models perform well. However, the applicability domain is quickly left even for simple reactions of the same type, as, for instance, present in our plate test set. The results show that yield prediction for new reactions is possible from the algorithmic side but in practice is hindered by the available data. Most importantly, more data that cover the diversity in reagents are needed for a general-purpose prediction of reaction yields. Our findings also expose a challenge to this field in that it appears to be extremely deceiving to judge models based on literature data with test sets which are split off the same literature data, even when challenging splits are considered. American Chemical Society 2023-01-11 /pmc/articles/PMC9878668/ /pubmed/36713686 http://dx.doi.org/10.1021/acsomega.2c05546 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle	Fitzner, Martin Wuitschik, Georg Koller, Raffael Adam, Jean-Michel Schindler, Torsten Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction
title	Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction
title_full	Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction
title_fullStr	Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction
title_full_unstemmed	Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction
title_short	Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction
title_sort	machine learning c–n couplings: obstacles for a general-purpose reaction yield prediction
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9878668/ https://www.ncbi.nlm.nih.gov/pubmed/36713686 http://dx.doi.org/10.1021/acsomega.2c05546
work_keys_str_mv	AT fitznermartin machinelearningcncouplingsobstaclesforageneralpurposereactionyieldprediction AT wuitschikgeorg machinelearningcncouplingsobstaclesforageneralpurposereactionyieldprediction AT kollerraffael machinelearningcncouplingsobstaclesforageneralpurposereactionyieldprediction AT adamjeanmichel machinelearningcncouplingsobstaclesforageneralpurposereactionyieldprediction AT schindlertorsten machinelearningcncouplingsobstaclesforageneralpurposereactionyieldprediction
