Multi-instance learning of graph neural networks for aqueous pK(a) prediction

Bibliographic Details
Main authors: Xiong, Jiacheng; Li, Zhaojun; Wang, Guangchao; Fu, Zunyun; Zhong, Feisheng; Xu, Tingyang; Liu, Xiaomeng; Huang, Ziming; Liu, Xiaohong; Chen, Kaixian; Jiang, Hualiang; Zheng, Mingyue
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8756178/
https://www.ncbi.nlm.nih.gov/pubmed/34643666
http://dx.doi.org/10.1093/bioinformatics/btab714
Description
Summary: MOTIVATION: The acid dissociation constant (pK(a)) is a critical parameter to reflect the ionization ability of chemical compounds and is widely applied in a variety of industries. However, the experimental determination of pK(a) is intricate and time-consuming, especially for the exact determination of micro-pK(a) information at the atomic level. Hence, a fast and accurate prediction of pK(a) values of chemical compounds is of broad interest. RESULTS: Here, we compiled a large-scale pK(a) dataset containing 16 595 compounds with 17 489 pK(a) values. Based on this dataset, a novel pK(a) prediction model, named Graph-pK(a), was established using graph neural networks. Graph-pK(a) performed well on the prediction of macro-pK(a) values, with a mean absolute error around 0.55 and a coefficient of determination around 0.92 on the test dataset. Furthermore, combining multi-instance learning, Graph-pK(a) was also able to automatically deconvolute the predicted macro-pK(a) into discrete micro-pK(a) values. AVAILABILITY AND IMPLEMENTATION: The Graph-pK(a) model is now freely accessible via a web-based interface (https://pka.simm.ac.cn/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
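Illustrative note: the summary refers to macro-pK(a) values, micro-pK(a) values, a mean absolute error and a coefficient of determination. The sketch below shows, under stated assumptions, how these quantities relate numerically. It is not the Graph-pK(a) implementation. The function names are hypothetical, and the aggregation rule assumes the textbook case in which several acidic sites lose a proton from the same protonation state, so that the macroscopic K(a) is the sum of the microscopic K(a) values.

```python
import math


def macro_pka_acidic(micro_pkas):
    """Combine micro-pKa values of competing acidic sites into one macro-pKa.

    Assumption: every site deprotonates from the same starting state, so
    Ka(macro) = sum of Ka(micro). Hypothetical helper, not the paper's code.
    """
    if not micro_pkas:
        raise ValueError("need at least one micro-pKa value")
    ka_total = sum(10.0 ** (-pka) for pka in micro_pkas)
    return -math.log10(ka_total)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Two equivalent acidic sites: the macro-pKa is lower than either micro-pKa.
    print(round(macro_pka_acidic([4.0, 4.0]), 3))
    # Toy measured vs. predicted pKa values (made up for illustration).
    measured = [1.0, 2.0, 3.0]
    predicted = [1.5, 2.0, 2.5]
    print(mean_absolute_error(measured, predicted), r_squared(measured, predicted))
```

Under this assumption the most acidic site dominates: a site with micro-pK(a) 4 next to one with micro-pK(a) 10 gives a macro-pK(a) of almost exactly 4, which is why a single measured macro value can still carry information about individual sites.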