Prediction of Compound Synthesis Accessibility Based on Reaction Knowledge Graph
With the increasing application of deep-learning-based generative models for de novo molecule design, the quantitative estimation of molecular synthetic accessibility (SA) has become a crucial factor for prioritizing the structures generated from generative models. It is also useful for helping in t...
Main authors: | Li, Baiqing; Chen, Hongming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8838603/ https://www.ncbi.nlm.nih.gov/pubmed/35164303 http://dx.doi.org/10.3390/molecules27031039 |
_version_ | 1784650167394238464 |
---|---|
author | Li, Baiqing Chen, Hongming |
collection | PubMed |
description | With the increasing application of deep-learning-based generative models for de novo molecule design, the quantitative estimation of molecular synthetic accessibility (SA) has become a crucial factor for prioritizing the structures generated from generative models. It is also useful for helping in the prioritization of hit/lead compounds and guiding retrosynthesis analysis. In this study, based on the USPTO and Pistachio reaction datasets, a chemical reaction network was constructed for the identification of the shortest reaction paths (SRP) needed to synthesize compounds, and different SRP cut-offs were then used as the threshold to distinguish an organic compound as either an easy-to-synthesize (ES) or hard-to-synthesize (HS) class. Two synthesis accessibility models (DNN-ECFP model and graph-based CMPNN model) were built using deep learning/machine learning algorithms. Compared to other existing synthesis accessibility scoring schemes, such as SYBA, SCScore, and SAScore, our results show that CMPNN (ROC AUC: 0.791) performs better than SYBA (ROC AUC: 0.76), albeit marginally, and outperforms SAScore and SCScore. Our prediction models based on historical reaction knowledge could be a potential tool for estimating molecule SA. |
format | Online Article Text |
id | pubmed-8838603 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8838603 2022-02-13 Prediction of Compound Synthesis Accessibility Based on Reaction Knowledge Graph Li, Baiqing Chen, Hongming Molecules Article With the increasing application of deep-learning-based generative models for de novo molecule design, the quantitative estimation of molecular synthetic accessibility (SA) has become a crucial factor for prioritizing the structures generated from generative models. It is also useful for helping in the prioritization of hit/lead compounds and guiding retrosynthesis analysis. In this study, based on the USPTO and Pistachio reaction datasets, a chemical reaction network was constructed for the identification of the shortest reaction paths (SRP) needed to synthesize compounds, and different SRP cut-offs were then used as the threshold to distinguish an organic compound as either an easy-to-synthesize (ES) or hard-to-synthesize (HS) class. Two synthesis accessibility models (DNN-ECFP model and graph-based CMPNN model) were built using deep learning/machine learning algorithms. Compared to other existing synthesis accessibility scoring schemes, such as SYBA, SCScore, and SAScore, our results show that CMPNN (ROC AUC: 0.791) performs better than SYBA (ROC AUC: 0.76), albeit marginally, and outperforms SAScore and SCScore. Our prediction models based on historical reaction knowledge could be a potential tool for estimating molecule SA. MDPI 2022-02-03 /pmc/articles/PMC8838603/ /pubmed/35164303 http://dx.doi.org/10.3390/molecules27031039 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Prediction of Compound Synthesis Accessibility Based on Reaction Knowledge Graph |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8838603/ https://www.ncbi.nlm.nih.gov/pubmed/35164303 http://dx.doi.org/10.3390/molecules27031039 |
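
The description in this record mentions finding shortest reaction paths (SRP) in a reaction network and applying an SRP cut-off to label compounds as ES or HS. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes a simplified directed graph in which each reactant points to its product, counts reaction steps with a breadth-first search, and treats unreachable compounds as HS. The function names, the toy reactions, and the cut-off value are all hypothetical.

```python
from collections import deque


def shortest_reaction_path(reactions, starting_materials, target):
    """Return the fewest reaction steps from any starting material to target.

    reactions: iterable of (reactants, product) pairs (hypothetical format).
    Returns None when the target cannot be reached in the network.
    """
    # Build a directed graph: each reactant points to the product it helps form.
    successors = {}
    for reactants, product in reactions:
        for reactant in reactants:
            successors.setdefault(reactant, set()).add(product)

    # Multi-source breadth-first search from all starting materials at once.
    distance = {compound: 0 for compound in starting_materials}
    queue = deque(starting_materials)
    while queue:
        compound = queue.popleft()
        if compound == target:
            return distance[compound]
        for product in successors.get(compound, ()):
            if product not in distance:
                distance[product] = distance[compound] + 1
                queue.append(product)
    return None


def classify(srp, cutoff):
    """Label a compound ES or HS from its SRP length and a chosen cut-off."""
    # Assumption: compounds with no known path are treated as hard to synthesize.
    if srp is None or srp > cutoff:
        return "HS"
    return "ES"


# Toy network (hypothetical compounds): A + B -> C, C -> D, D -> F.
toy_reactions = [(("A", "B"), "C"), (("C",), "D"), (("D",), "F")]
toy_starting_materials = ["A", "B"]
```

With this toy network, `classify(shortest_reaction_path(toy_reactions, toy_starting_materials, "F"), cutoff=2)` would label compound F using an illustrative cut-off of two steps. A real reaction network would also need to check that every reactant of a step is itself available, which this simplified graph ignores.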