DeepSA: a deep-learning driven predictor of compound synthesis accessibility
With the continuous development of artificial intelligence technology, more and more computational models for generating new molecules are being developed. However, we are often confronted with the question of whether these compounds are easy or difficult to synthesize, which refers to synthetic acc...
Main Authors: | Wang, Shihang; Wang, Lin; Li, Fenglei; Bai, Fang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10621138/ https://www.ncbi.nlm.nih.gov/pubmed/37919805 http://dx.doi.org/10.1186/s13321-023-00771-3 |
_version_ | 1785130351039873024 |
---|---|
author | Wang, Shihang Wang, Lin Li, Fenglei Bai, Fang |
author_facet | Wang, Shihang Wang, Lin Li, Fenglei Bai, Fang |
author_sort | Wang, Shihang |
collection | PubMed |
description | With the continuous development of artificial intelligence technology, more and more computational models for generating new molecules are being developed. However, we are often confronted with the question of whether these compounds are easy or difficult to synthesize, which refers to the synthetic accessibility of compounds. In this study, a deep-learning-based computational model called DeepSA was proposed to predict the synthesis accessibility of compounds, providing a useful tool for choosing molecules. DeepSA is a chemical language model that was developed by training on a dataset of 3,593,053 molecules using various natural language processing (NLP) algorithms. It offers advantages over state-of-the-art methods and achieves a much higher area under the receiver operating characteristic curve (AUROC), i.e., 89.6%, in discriminating those molecules that are difficult to synthesize. This helps users select less expensive molecules for synthesis, reducing the time and cost required for drug discovery and development. Interestingly, a comparison of DeepSA with a Graph Attention-based method shows that using SMILES alone can also efficiently visualize and extract a compound's informative features. DeepSA is available online on our group's web server (https://bailab.siais.shanghaitech.edu.cn/services/deepsa/), and the code is available at https://github.com/Shihang-Wang-58/DeepSA. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-023-00771-3. |
format | Online Article Text |
id | pubmed-10621138 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-10621138 2023-11-03 DeepSA: a deep-learning driven predictor of compound synthesis accessibility Wang, Shihang Wang, Lin Li, Fenglei Bai, Fang J Cheminform Research
Springer International Publishing 2023-11-02 /pmc/articles/PMC10621138/ /pubmed/37919805 http://dx.doi.org/10.1186/s13321-023-00771-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Wang, Shihang Wang, Lin Li, Fenglei Bai, Fang DeepSA: a deep-learning driven predictor of compound synthesis accessibility |
title | DeepSA: a deep-learning driven predictor of compound synthesis accessibility |
title_full | DeepSA: a deep-learning driven predictor of compound synthesis accessibility |
title_fullStr | DeepSA: a deep-learning driven predictor of compound synthesis accessibility |
title_full_unstemmed | DeepSA: a deep-learning driven predictor of compound synthesis accessibility |
title_short | DeepSA: a deep-learning driven predictor of compound synthesis accessibility |
title_sort | deepsa: a deep-learning driven predictor of compound synthesis accessibility |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10621138/ https://www.ncbi.nlm.nih.gov/pubmed/37919805 http://dx.doi.org/10.1186/s13321-023-00771-3 |
work_keys_str_mv | AT wangshihang deepsaadeeplearningdrivenpredictorofcompoundsynthesisaccessibility AT wanglin deepsaadeeplearningdrivenpredictorofcompoundsynthesisaccessibility AT lifenglei deepsaadeeplearningdrivenpredictorofcompoundsynthesisaccessibility AT baifang deepsaadeeplearningdrivenpredictorofcompoundsynthesisaccessibility |