Improving Chemical Reaction Prediction with Unlabeled Data

Predicting products of organic chemical reactions is useful in chemical sciences, especially when one or more reactants are new organics. However, the performance of traditional learning models heavily relies on high-quality labeled data. In this work, to utilize unlabeled data for better prediction...

Bibliographic Details
Main authors: Xie, Yu; Zhang, Yuyang; Wong, Ka-Chun; Shi, Meixia; Peng, Chengbin
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9506495/
https://www.ncbi.nlm.nih.gov/pubmed/36144703
http://dx.doi.org/10.3390/molecules27185967
_version_ 1784796737170309120
author Xie, Yu
Zhang, Yuyang
Wong, Ka-Chun
Shi, Meixia
Peng, Chengbin
collection PubMed
description Predicting products of organic chemical reactions is useful in chemical sciences, especially when one or more reactants are new organics. However, the performance of traditional learning models heavily relies on high-quality labeled data. In this work, to utilize unlabeled data for better prediction performance, we propose a method that combines semi-supervised learning with graph convolutional neural networks for chemical reaction prediction. First, we propose a Mean Teacher Weisfeiler–Lehman Network to find the reaction centers. Then, we construct the candidate product set. Finally, we use an Improved Weisfeiler–Lehman Difference Network to rank candidate products. Experimental results demonstrate that, with 400k labeled data, our framework can improve the top-5 accuracy by 0.7% using 35k unlabeled data. When the proportion of unlabeled data increases, the performance gain can be larger. For example, with 80k labeled data and 35k unlabeled data, the performance gain with our framework can be 1.8%.
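The description above names a Mean Teacher scheme for using unlabeled reactions. As a rough illustration of the general Mean Teacher idea (not the authors' implementation), the sketch below keeps a teacher whose weights are an exponential moving average of the student's weights and penalizes disagreement between the two models' predictions on the same input. The function names, the dictionary-of-floats weight layout, the smoothing factor `alpha`, and the squared-error consistency term are all assumptions made for this example; the record does not state the paper's actual settings.

```python
from typing import Dict, List


def ema_update(teacher: Dict[str, float],
               student: Dict[str, float],
               alpha: float = 0.99) -> Dict[str, float]:
    """Return new teacher weights: an exponential moving average of the
    previous teacher weights and the current student weights.
    `alpha` is a hypothetical smoothing factor, not a value from the paper."""
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name]
            for name in teacher}


def consistency_loss(student_probs: List[float],
                     teacher_probs: List[float]) -> float:
    """Mean squared difference between student and teacher predictions.
    It needs no labels, so it can be computed on unlabeled examples."""
    if len(student_probs) != len(teacher_probs):
        raise ValueError("prediction lists must have the same length")
    return sum((s - t) ** 2
               for s, t in zip(student_probs, teacher_probs)) / len(student_probs)


# Toy usage: one scalar weight and two predicted probabilities.
teacher_weights = {"w": 1.0}
student_weights = {"w": 0.0}
teacher_weights = ema_update(teacher_weights, student_weights, alpha=0.9)
loss = consistency_loss([0.5, 0.5], [1.0, 0.0])
```

In a full training loop the student would minimize the supervised loss on labeled reactions plus a weighted consistency term on all reactions, and the teacher would be refreshed with `ema_update` after each student step.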
format Online
Article
Text
id pubmed-9506495
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9506495 2022-09-24 Improving Chemical Reaction Prediction with Unlabeled Data Xie, Yu Zhang, Yuyang Wong, Ka-Chun Shi, Meixia Peng, Chengbin Molecules Article Predicting products of organic chemical reactions is useful in chemical sciences, especially when one or more reactants are new organics. However, the performance of traditional learning models heavily relies on high-quality labeled data. In this work, to utilize unlabeled data for better prediction performance, we propose a method that combines semi-supervised learning with graph convolutional neural networks for chemical reaction prediction. First, we propose a Mean Teacher Weisfeiler–Lehman Network to find the reaction centers. Then, we construct the candidate product set. Finally, we use an Improved Weisfeiler–Lehman Difference Network to rank candidate products. Experimental results demonstrate that, with 400k labeled data, our framework can improve the top-5 accuracy by 0.7% using 35k unlabeled data. When the proportion of unlabeled data increases, the performance gain can be larger. For example, with 80k labeled data and 35k unlabeled data, the performance gain with our framework can be 1.8%. MDPI 2022-09-14 /pmc/articles/PMC9506495/ /pubmed/36144703 http://dx.doi.org/10.3390/molecules27185967 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Improving Chemical Reaction Prediction with Unlabeled Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9506495/
https://www.ncbi.nlm.nih.gov/pubmed/36144703
http://dx.doi.org/10.3390/molecules27185967