
Precise modelling and interpretation of bioactivities of ligands targeting G protein-coupled receptors

Bibliographic Details
Main Authors: Wu, Jiansheng, Liu, Ben, Chan, Wallace K B, Wu, Weijian, Pang, Tao, Hu, Haifeng, Yan, Shancheng, Ke, Xiaoyan, Zhang, Yang
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612825/
https://www.ncbi.nlm.nih.gov/pubmed/31510691
http://dx.doi.org/10.1093/bioinformatics/btz336
Description
Summary: MOTIVATION: Accurate prediction and interpretation of ligand bioactivities are essential for virtual screening and drug discovery. Unfortunately, many important drug targets lack experimental data on ligand bioactivities; this is particularly true for G protein-coupled receptors (GPCRs), which account for the targets of about a third of drugs currently on the market. Computational approaches capable of precisely assessing ligand bioactivities and identifying the key substructural features that determine them are needed to address this issue. RESULTS: A new method, SED, was proposed to predict ligand bioactivities and to recognize key substructures associated with GPCRs through the coupling of screening for Lasso of long extended-connectivity fingerprints (ECFPs) with deep neural network training. The SED pipeline contains three successive steps: (i) representation of long ECFPs for ligand molecules, (ii) feature selection by screening for Lasso of ECFPs and (iii) bioactivity prediction through a deep neural network regression model. The method was examined on a set of 16 representative GPCRs that cover most subfamilies of human GPCRs, each with 300–5000 ligand associations. The results show that SED achieves excellent performance in modelling ligand bioactivities, especially for GPCR datasets without sufficient ligand associations, where SED improved on the baseline predictors by 12% in correlation coefficient (r²) and 19% in root mean square error. Detailed data analyses suggest that the major advantage of SED lies in its ability to detect substructures from long ECFPs, which significantly improves the predictive performance. AVAILABILITY AND IMPLEMENTATION: The source code and datasets of SED are freely available at https://zhanglab.ccmb.med.umich.edu/SED/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
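
Illustrative note on step (ii): the summary does not say which screening rule SED applies before the Lasso, so the sketch below should be read as an assumption rather than as the authors' implementation. It uses the basic SAFE test, a well-known screening rule for the Lasso objective 1/2·||y − Xβ||² + λ||β||₁, which discards a feature j whenever |x_j·y| < λ − ||x_j||·||y||·(λ_max − λ)/λ_max, with λ_max = max_j |x_j·y|. The function name `safe_screen`, the 4-bit toy fingerprints and the centred activity values are all hypothetical; real ECFPs are far longer.

```python
import math


def safe_screen(X, y, lam):
    """Return the column indices that survive the basic SAFE test.

    X   : list of rows, one binary fingerprint per ligand (toy data here)
    y   : centred bioactivity values, one per ligand
    lam : Lasso penalty (lambda) for 1/2*||y - X b||^2 + lam*||b||_1

    Columns that fail the test are guaranteed to receive a zero
    coefficient in the Lasso solution at this penalty, so they can be
    dropped before any model is fitted.
    """
    n_features = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_features)]
    # |x_j . y| and ||x_j|| for every fingerprint bit
    dots = [abs(sum(c * t for c, t in zip(col, y))) for col in cols]
    norms = [math.sqrt(sum(c * c for c in col)) for col in cols]
    y_norm = math.sqrt(sum(t * t for t in y))
    lam_max = max(dots)
    if lam >= lam_max:
        # At or above lam_max the Lasso solution is entirely zero.
        return []
    slack = (lam_max - lam) / lam_max
    return [
        j
        for j in range(n_features)
        if dots[j] >= lam - norms[j] * y_norm * slack
    ]


# Hypothetical toy data: 4 ligands x 4 fingerprint bits.
X = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
]
y = [1.0, 1.0, -1.0, -1.0]  # centred activities (made-up values)

kept = safe_screen(X, y, lam=1.9)
print(kept)  # [0, 1]
```

With a penalty close to λ_max, only the two bits most strongly correlated with activity survive. At smaller penalties the test becomes conservative and keeps every bit, which is why screening rules of this kind are typically applied along a decreasing sequence of penalties. The surviving bits would then feed the regression model of step (iii).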