
Multistep retrosynthesis combining a disconnection aware triple transformer loop with a route penalty score guided tree search

Computer-aided synthesis planning (CASP) aims to automatically learn organic reactivity from literature and perform retrosynthesis of unseen molecules. CASP systems must learn reactions sufficiently precisely to propose realistic disconnections, while avoiding overfitting to leave room for diverse o...


Bibliographic Details
Main authors: Kreutter, David; Reymond, Jean-Louis
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2023
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10510629/
https://www.ncbi.nlm.nih.gov/pubmed/37736648
http://dx.doi.org/10.1039/d3sc01604h
_version_ 1785107994985365504
author Kreutter, David
Reymond, Jean-Louis
collection PubMed
description Computer-aided synthesis planning (CASP) aims to automatically learn organic reactivity from literature and perform retrosynthesis of unseen molecules. CASP systems must learn reactions sufficiently precisely to propose realistic disconnections, while avoiding overfitting to leave room for diverse options, and explore possible routes so that short synthetic sequences can emerge. Herein we report an open-source CASP tool proposing original solutions to both challenges. First, we use a triple transformer loop (TTL) predicting starting materials (T1), reagents (T2), and products (T3) to explore various disconnection sites defined by combining systematic, template-based, and transformer-based tagging procedures. Second, we integrate TTL into a multistep tree search algorithm (TTLA) prioritizing sequences using a route penalty score (RPScore) considering the number of steps, their confidence score, and the simplicity of all intermediates along the route. Our approach favours short synthetic routes to commercial starting materials, as exemplified by retrosynthetic analyses of recently approved drugs.
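The penalty-guided tree search described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the record does not give the RPScore formula, so `route_penalty` below is an assumed stand-in that only combines the three ingredients the abstract names (number of steps, step confidence, simplicity of intermediates). The lookup table `TOY_DISCONNECTIONS`, the `COMMERCIAL` set, and the `COMPLEXITY` values are hypothetical placeholders standing in for the TTL predictions, a building-block catalogue, and a molecular complexity measure.

```python
import heapq
from itertools import count

# Hypothetical single-step "model": product -> [(precursors, confidence)].
# In the paper this role is played by the triple transformer loop (TTL);
# the entries below are made-up placeholders, not real chemistry.
TOY_DISCONNECTIONS = {
    "target": [(("A", "B"), 0.9), (("C",), 0.6)],
    "A": [(("D",), 0.8)],
    "C": [(("D", "E"), 0.95)],
}
COMMERCIAL = {"B", "D", "E"}  # assumed purchasable starting materials
COMPLEXITY = {"A": 3.0, "B": 1.0, "C": 4.0, "D": 1.0, "E": 1.0}  # assumed values


def route_penalty(steps):
    """Illustrative penalty (NOT the published RPScore); lower is better."""
    n_steps = len(steps)
    doubt = sum(1.0 - conf for _, _, conf in steps)
    # Penalise complex non-commercial intermediates along the route.
    complexity = sum(
        COMPLEXITY[p]
        for _, precursors, _ in steps
        for p in precursors
        if p not in COMMERCIAL
    )
    return n_steps + doubt + 0.1 * complexity


def search(target, max_steps=5):
    """Best-first search: always expand the partial route with the lowest penalty."""
    tie = count()  # tie-breaker so the heap never compares route tuples
    queue = [(0.0, next(tie), (), (target,))]
    while queue:
        penalty, _, steps, open_mols = heapq.heappop(queue)
        if not open_mols:  # every leaf is commercial: route solved
            return penalty, list(steps)
        if len(steps) >= max_steps:
            continue
        mol, rest = open_mols[0], open_mols[1:]
        for precursors, conf in TOY_DISCONNECTIONS.get(mol, []):
            new_steps = steps + ((mol, precursors, conf),)
            new_open = rest + tuple(p for p in precursors if p not in COMMERCIAL)
            heapq.heappush(
                queue, (route_penalty(new_steps), next(tie), new_steps, new_open)
            )
    return None  # no route to commercial starting materials was found
```

Calling `search("target")` on the toy data returns the two-step route through intermediate `A`, because it accumulates a lower penalty than the route through the more complex, lower-confidence intermediate `C`. Since every term of the assumed penalty is non-negative and additive, the first solved route popped from the queue is also the lowest-penalty one in this sketch.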
format Online
Article
Text
id pubmed-10510629
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-10510629 2023-09-21 Multistep retrosynthesis combining a disconnection aware triple transformer loop with a route penalty score guided tree search Kreutter, David; Reymond, Jean-Louis Chem Sci Chemistry The Royal Society of Chemistry 2023-09-01 /pmc/articles/PMC10510629/ /pubmed/37736648 http://dx.doi.org/10.1039/d3sc01604h Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by/3.0/
title Multistep retrosynthesis combining a disconnection aware triple transformer loop with a route penalty score guided tree search
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10510629/
https://www.ncbi.nlm.nih.gov/pubmed/37736648
http://dx.doi.org/10.1039/d3sc01604h