Prediction of Organic Reaction Outcomes Using Machine Learning

Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions is that proposed reaction steps often fail when attempted...

Bibliographic Details
Main authors: Coley, Connor W., Barzilay, Regina, Jaakkola, Tommi S., Green, William H., Jensen, Klavs F.
Format: Online Article Text
Language: English
Published: American Chemical Society 2017
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5445544/
https://www.ncbi.nlm.nih.gov/pubmed/28573205
http://dx.doi.org/10.1021/acscentsci.7b00064
_version_ 1783238916517658624
author Coley, Connor W.
Barzilay, Regina
Jaakkola, Tommi S.
Green, William H.
Jensen, Klavs F.
author_sort Coley, Connor W.
collection PubMed
description Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions is that proposed reaction steps often fail when attempted in the laboratory, despite initially seeming viable. The true measure of success for any synthesis program is whether the predicted outcome matches what is observed experimentally. We report a model framework for anticipating reaction outcomes that combines the traditional use of reaction templates with the flexibility in pattern recognition afforded by neural networks. Using 15 000 experimental reaction records from granted United States patents, a model is trained to select the major (recorded) product by ranking a self-generated list of candidates where one candidate is known to be the major product. Candidate reactions are represented using a unique edit-based representation that emphasizes the fundamental transformation from reactants to products, rather than the constituent molecules’ overall structures. In a 5-fold cross-validation, the trained model assigns the major product rank 1 in 71.8% of cases, rank ≤3 in 86.7% of cases, and rank ≤5 in 90.8% of cases.
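The rank-based figures quoted in the description (rank 1, rank ≤3, rank ≤5) are top-k accuracies: the fraction of test reactions for which the recorded product appears among the k highest-scored candidates. The following is a minimal sketch of how such a metric could be computed, not the authors' code. The function names, the tie-handling rule, and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-indexed rank of the true candidate under descending score.

    Assumption: ties are resolved in favor of the true candidate.
    """
    true_score = scores[true_index]
    # Rank is one more than the number of candidates scored strictly higher.
    return 1 + sum(1 for s in scores if s > true_score)


def top_k_accuracy(true_ranks: Sequence[int], k: int) -> float:
    """Fraction of examples whose true candidate has rank <= k."""
    if not true_ranks:
        raise ValueError("true_ranks must be non-empty")
    return sum(1 for r in true_ranks if r <= k) / len(true_ranks)


# Hypothetical toy data: (candidate scores, index of the recorded product).
examples = [
    ([0.9, 0.1, 0.3], 0),
    ([0.2, 0.7, 0.5], 2),
    ([0.4, 0.3, 0.2, 0.1], 3),
    ([0.1, 0.8], 1),
]

ranks = [rank_of_true(scores, idx) for scores, idx in examples]

for k in (1, 3, 5):
    print(f"rank <= {k}: {top_k_accuracy(ranks, k):.1%}")
```

In a k-fold cross-validation, the same computation would be applied to the held-out examples of each fold and the results pooled or averaged; which of the two the paper uses is not stated in this record.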
format Online
Article
Text
id pubmed-5445544
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-5445544 2017-06-01 Prediction of Organic Reaction Outcomes Using Machine Learning Coley, Connor W. Barzilay, Regina Jaakkola, Tommi S. Green, William H. Jensen, Klavs F. ACS Cent Sci American Chemical Society 2017-04-18 2017-05-24 /pmc/articles/PMC5445544/ /pubmed/28573205 http://dx.doi.org/10.1021/acscentsci.7b00064 Text en Copyright © 2017 American Chemical Society This is an open access article published under an ACS AuthorChoice License (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html), which permits copying and redistribution of the article or any adaptations for non-commercial purposes.
title Prediction of Organic Reaction Outcomes Using Machine Learning
title_sort prediction of organic reaction outcomes using machine learning