Cargando…

Knowledge-Embedded Message-Passing Neural Networks: Improving Molecular Property Prediction with Human Knowledge

[Image: see text] The graph neural network (GNN) has become a promising method to predict molecular properties with end-to-end supervision, as it can learn molecular features directly from chemical graphs in a black-box manner. However, to achieve high prediction accuracy, it is essential to supervi...

Descripción completa

Detalles Bibliográficos
Autor principal: Hasebe, Tatsuya
Formato: Online Artículo Texto
Lenguaje:English
Publicado: American Chemical Society 2021
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8552328/
https://www.ncbi.nlm.nih.gov/pubmed/34722995
http://dx.doi.org/10.1021/acsomega.1c03839
Descripción
Sumario:[Image: see text] The graph neural network (GNN) has become a promising method to predict molecular properties with end-to-end supervision, as it can learn molecular features directly from chemical graphs in a black-box manner. However, to achieve high prediction accuracy, it is essential to supervise a huge amount of property data, which is often accompanied by a high property experiment cost. Prior to the deep learning method, descriptor-based quantitative structure–property relationships (QSPR) studies have investigated physical and chemical knowledge to manually design descriptors for effectively predicting properties. In this study, we extend a message-passing neural network (MPNN) to include a novel MPNN architecture called the knowledge-embedded MPNN (KEMPNN) that can be supervised together with nonquantitative knowledge annotations by human experts on a chemical graph that contains information on the important substructure of a molecule and its effect on the target property (e.g., positive or negative effect). We evaluated the performance of the KEMPNN in a small training data setting using a physical chemistry dataset in MoleculeNet (ESOL, FreeSolv, Lipophilicity) and a polymer property (glass-transition temperature) dataset with virtual knowledge annotations. The results demonstrate that the KEMPNN with knowledge supervision can improve the prediction accuracy obtained from the MPNN. The results also demonstrate that the accuracy of the KEMPNN is better than or comparable to those of descriptor-based methods even in the case of small training data.