
Simplified, interpretable graph convolutional neural networks for small molecule activity prediction


Bibliographic Details
Main Authors: Weber, Jeffrey K., Morrone, Joseph A., Bagchi, Sugato, Pabon, Jan D. Estrada, Kang, Seung-gu, Zhang, Leili, Cornell, Wendy D.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9325818/
https://www.ncbi.nlm.nih.gov/pubmed/34817762
http://dx.doi.org/10.1007/s10822-021-00421-6
Description
Summary: We present a streamlined, explainable graph convolutional neural network (gCNN) architecture for small molecule activity prediction. We first conduct a hyperparameter optimization across nearly 800 protein targets that produces a simplified gCNN QSAR architecture, and we observe that such a model can yield performance improvements over both standard gCNN and random forest (RF) methods on difficult-to-classify test sets. Additionally, we discuss how reductions in convolutional layer dimensions potentially speak to the “anatomical” needs of gCNNs with respect to radial coarse graining of molecular substructure. We augment this simplified architecture with saliency map technology that highlights molecular substructures relevant to activity, and we perform saliency analysis on nearly 100 data-rich protein targets. We show that the resulting substructural clusters are useful visualization tools for understanding substructure-activity relationships. We go on to highlight connections between our models’ saliency predictions and observations made in the medicinal chemistry literature, focusing on four case studies of past lead finding and lead optimization campaigns. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10822-021-00421-6.