
Generic Interpretable Reaction Condition Predictions with Open Reaction Condition Datasets and Unsupervised Learning of Reaction Center

Effective synthesis planning powered by deep learning (DL) can significantly accelerate the discovery of new drugs and materials. However, most DL-assisted synthesis planning methods offer either none or very limited capability to recommend suitable reaction conditions (RCs) for their reaction predi...


Bibliographic Details
Main Authors: Wang, Xiaorui, Hsieh, Chang-Yu, Yin, Xiaodan, Wang, Jike, Li, Yuquan, Deng, Yafeng, Jiang, Dejun, Wu, Zhenxing, Du, Hongyan, Chen, Hongming, Li, Yun, Liu, Huanxiang, Wang, Yuwei, Luo, Pei, Hou, Tingjun, Yao, Xiaojun
Format: Online Article Text
Language: English
Published: AAAS 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10578430/
https://www.ncbi.nlm.nih.gov/pubmed/37849643
http://dx.doi.org/10.34133/research.0231
author Wang, Xiaorui
Hsieh, Chang-Yu
Yin, Xiaodan
Wang, Jike
Li, Yuquan
Deng, Yafeng
Jiang, Dejun
Wu, Zhenxing
Du, Hongyan
Chen, Hongming
Li, Yun
Liu, Huanxiang
Wang, Yuwei
Luo, Pei
Hou, Tingjun
Yao, Xiaojun
author_sort Wang, Xiaorui
collection PubMed
description Effective synthesis planning powered by deep learning (DL) can significantly accelerate the discovery of new drugs and materials. However, most DL-assisted synthesis planning methods offer either none or very limited capability to recommend suitable reaction conditions (RCs) for their reaction predictions. Currently, the prediction of RCs with a DL framework is hindered by several factors, including: (a) lack of a standardized dataset for benchmarking, (b) lack of a general prediction model with powerful representation, and (c) lack of interpretability. To address these issues, we first created 2 standardized RC datasets covering a broad range of reaction classes and then proposed a powerful and interpretable Transformer-based RC predictor named Parrot. Through careful design of the model architecture, pretraining method, and training strategy, Parrot improved the overall top-3 prediction accuracy on catalysis, solvents, and other reagents by as much as 13.44%, compared to the best previous model on a newly curated dataset. Additionally, the mean absolute error of the predicted temperatures was reduced by about 4 °C. Furthermore, Parrot manifests strong generalization capacity with superior cross-chemical-space prediction accuracy. Attention analysis indicates that Parrot effectively captures crucial chemical information and exhibits a high level of interpretability in the prediction of RCs. The proposed model Parrot exemplifies how modern neural network architecture when appropriately pretrained can be versatile in making reliable, generalizable, and interpretable recommendation for RCs even when the underlying training dataset may still be limited in diversity.
format Online
Article
Text
id pubmed-10578430
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher AAAS
record_format MEDLINE/PubMed
spelling pubmed-105784302023-10-17 Generic Interpretable Reaction Condition Predictions with Open Reaction Condition Datasets and Unsupervised Learning of Reaction Center Wang, Xiaorui Hsieh, Chang-Yu Yin, Xiaodan Wang, Jike Li, Yuquan Deng, Yafeng Jiang, Dejun Wu, Zhenxing Du, Hongyan Chen, Hongming Li, Yun Liu, Huanxiang Wang, Yuwei Luo, Pei Hou, Tingjun Yao, Xiaojun Research (Wash D C) Research Article Effective synthesis planning powered by deep learning (DL) can significantly accelerate the discovery of new drugs and materials. However, most DL-assisted synthesis planning methods offer either none or very limited capability to recommend suitable reaction conditions (RCs) for their reaction predictions. Currently, the prediction of RCs with a DL framework is hindered by several factors, including: (a) lack of a standardized dataset for benchmarking, (b) lack of a general prediction model with powerful representation, and (c) lack of interpretability. To address these issues, we first created 2 standardized RC datasets covering a broad range of reaction classes and then proposed a powerful and interpretable Transformer-based RC predictor named Parrot. Through careful design of the model architecture, pretraining method, and training strategy, Parrot improved the overall top-3 prediction accuracy on catalysis, solvents, and other reagents by as much as 13.44%, compared to the best previous model on a newly curated dataset. Additionally, the mean absolute error of the predicted temperatures was reduced by about 4 °C. Furthermore, Parrot manifests strong generalization capacity with superior cross-chemical-space prediction accuracy. Attention analysis indicates that Parrot effectively captures crucial chemical information and exhibits a high level of interpretability in the prediction of RCs. 
The proposed model Parrot exemplifies how modern neural network architecture when appropriately pretrained can be versatile in making reliable, generalizable, and interpretable recommendation for RCs even when the underlying training dataset may still be limited in diversity. AAAS 2023-10-16 /pmc/articles/PMC10578430/ /pubmed/37849643 http://dx.doi.org/10.34133/research.0231 Text en Copyright © 2023 Xiaorui Wang et al. https://creativecommons.org/licenses/by/4.0/Exclusive licensee Science and Technology Review Publishing House. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution License 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/) .
title Generic Interpretable Reaction Condition Predictions with Open Reaction Condition Datasets and Unsupervised Learning of Reaction Center
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10578430/
https://www.ncbi.nlm.nih.gov/pubmed/37849643
http://dx.doi.org/10.34133/research.0231