MetaRF: attention-based random forest for reaction yield prediction with a few trails

Artificial intelligence has deeply revolutionized the field of medicinal chemistry with many impressive applications, but the success of these applications requires a massive amount of training samples with high-quality annotations, which seriously limits the wide usage of data-driven methods. In th...

Bibliographic Details
Main authors: Chen, Kexin; Chen, Guangyong; Li, Junyou; Huang, Yuansheng; Wang, Ercheng; Hou, Tingjun; Heng, Pheng-Ann
Format: Online Article Text
Language: English
Published: Springer International Publishing 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10084704/
https://www.ncbi.nlm.nih.gov/pubmed/37038222
http://dx.doi.org/10.1186/s13321-023-00715-x
author Chen, Kexin
Chen, Guangyong
Li, Junyou
Huang, Yuansheng
Wang, Ercheng
Hou, Tingjun
Heng, Pheng-Ann
author_sort Chen, Kexin
collection PubMed
description Artificial intelligence has deeply revolutionized the field of medicinal chemistry with many impressive applications, but the success of these applications requires a massive amount of training samples with high-quality annotations, which seriously limits the wide usage of data-driven methods. In this paper, we focus on the reaction yield prediction problem, which assists chemists in selecting high-yield reactions in a new chemical space only with a few experimental trials. To attack this challenge, we first put forth MetaRF, an attention-based random forest model specially designed for the few-shot yield prediction, where the attention weight of a random forest is automatically optimized by the meta-learning framework and can be quickly adapted to predict the performance of new reagents while given a few additional samples. To improve the few-shot learning performance, we further introduce a dimension-reduction based sampling method to determine valuable samples to be experimentally tested and then learned. Our methodology is evaluated on three different datasets and acquires satisfactory performance on few-shot prediction. In high-throughput experimentation (HTE) datasets, the average yield of our methodology’s top 10 high-yield reactions is relatively close to the results of ideal yield selection.
format Online
Article
Text
id pubmed-10084704
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-100847042023-04-11 MetaRF: attention-based random forest for reaction yield prediction with a few trails Chen, Kexin Chen, Guangyong Li, Junyou Huang, Yuansheng Wang, Ercheng Hou, Tingjun Heng, Pheng-Ann J Cheminform Research Artificial intelligence has deeply revolutionized the field of medicinal chemistry with many impressive applications, but the success of these applications requires a massive amount of training samples with high-quality annotations, which seriously limits the wide usage of data-driven methods. In this paper, we focus on the reaction yield prediction problem, which assists chemists in selecting high-yield reactions in a new chemical space only with a few experimental trials. To attack this challenge, we first put forth MetaRF, an attention-based random forest model specially designed for the few-shot yield prediction, where the attention weight of a random forest is automatically optimized by the meta-learning framework and can be quickly adapted to predict the performance of new reagents while given a few additional samples. To improve the few-shot learning performance, we further introduce a dimension-reduction based sampling method to determine valuable samples to be experimentally tested and then learned. Our methodology is evaluated on three different datasets and acquires satisfactory performance on few-shot prediction. In high-throughput experimentation (HTE) datasets, the average yield of our methodology’s top 10 high-yield reactions is relatively close to the results of ideal yield selection. 
Springer International Publishing 2023-04-10 /pmc/articles/PMC10084704/ /pubmed/37038222 http://dx.doi.org/10.1186/s13321-023-00715-x Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Chen, Kexin
Chen, Guangyong
Li, Junyou
Huang, Yuansheng
Wang, Ercheng
Hou, Tingjun
Heng, Pheng-Ann
MetaRF: attention-based random forest for reaction yield prediction with a few trails
title MetaRF: attention-based random forest for reaction yield prediction with a few trails
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10084704/
https://www.ncbi.nlm.nih.gov/pubmed/37038222
http://dx.doi.org/10.1186/s13321-023-00715-x