Molecular property prediction by contrastive learning with attention-guided positive sample selection

Bibliographic Details
Main authors: Wang, Jinxian; Guan, Jihong; Zhou, Shuigeng
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10188298/
https://www.ncbi.nlm.nih.gov/pubmed/37079731
http://dx.doi.org/10.1093/bioinformatics/btad258
Description
Summary:
MOTIVATION: Predicting molecular properties is one of the fundamental problems in drug design and discovery. In recent years, self-supervised learning (SSL) has shown promising performance in image recognition, natural language processing, and single-cell data analysis. Contrastive learning (CL) is a typical SSL method that learns features of the data so that the trained model can distinguish between samples more effectively. One important issue in CL is how to select positive samples for each training example, a choice that significantly affects the performance of CL.
RESULTS: In this article, we propose a new method for molecular property prediction (MPP) by Contrastive Learning with Attention-guided Positive-sample Selection (CLAPS). First, we generate positive samples for each training example based on an attention-guided selection scheme. Second, we employ a Transformer encoder to extract latent feature vectors and compute a contrastive loss that aims to distinguish positive sample pairs from negative ones. Finally, we use the trained encoder to predict molecular properties. Experiments on various benchmark datasets show that our approach outperforms the state-of-the-art (SOTA) methods in most cases.
AVAILABILITY AND IMPLEMENTATION: The code is publicly available at https://github.com/wangjx22/CLAPS.
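
The contrastive objective in the second step can be illustrated with a minimal sketch. The summary does not say which loss CLAPS uses, so the InfoNCE-style form, the cosine similarity, the temperature value, and the function names below are all assumptions chosen for illustration, not the authors' implementation. Each anchor embedding is paired with its own positive, and the positives of the other examples in the batch act as negatives.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch (assumed form, not from the paper).

    anchors[i] and positives[i] form the positive pair; positives[j] with
    j != i serve as negatives for anchor i.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        # Temperature-scaled similarities of anchor i to every candidate.
        logits = [
            cosine_similarity(anchors[i], positives[j]) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        # Cross-entropy with the true positive at index i.
        total += log_sum_exp - logits[i]
    return total / n


# Toy embeddings: correctly matched pairs versus deliberately swapped pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
matched = info_nce_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(matched < swapped)
```

In this toy batch the loss is near zero when each anchor is most similar to its own positive and large when the pairs are swapped, which is the behaviour a contrastive encoder is trained toward. In a real pipeline the vectors would come from the encoder rather than being written by hand.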