
3DDPDs: describing protein dynamics for proteochemometric bioactivity prediction. A case for (mutant) G protein-coupled receptors

Proteochemometric (PCM) modelling is a powerful computational drug discovery tool used in bioactivity prediction of potential drug candidates relying on both chemical and protein information. In PCM, features are computed to describe small molecules and proteins, which directly impact the quality of...


Bibliographic Details
Main authors: Gorostiola González, Marina, van den Broek, Remco L., Braun, Thomas G. M., Chatzopoulou, Magdalini, Jespers, Willem, IJzerman, Adriaan P., Heitman, Laura H., van Westen, Gerard J. P.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10463931/
https://www.ncbi.nlm.nih.gov/pubmed/37641107
http://dx.doi.org/10.1186/s13321-023-00745-5
author Gorostiola González, Marina
van den Broek, Remco L.
Braun, Thomas G. M.
Chatzopoulou, Magdalini
Jespers, Willem
IJzerman, Adriaan P.
Heitman, Laura H.
van Westen, Gerard J. P.
author_sort Gorostiola González, Marina
collection PubMed
description Proteochemometric (PCM) modelling is a powerful computational drug discovery tool used in bioactivity prediction of potential drug candidates relying on both chemical and protein information. In PCM, features are computed to describe small molecules and proteins, which directly impact the quality of the predictive models. State-of-the-art protein descriptors, however, are calculated from the protein sequence and neglect the dynamic nature of proteins. This dynamic nature can be computationally simulated with molecular dynamics (MD). Here, novel 3D dynamic protein descriptors (3DDPDs) were designed to be applied in bioactivity prediction tasks with PCM models. As a test case, publicly available G protein-coupled receptor (GPCR) MD data from GPCRmd was used. GPCRs are membrane-bound proteins, which are activated by hormones and neurotransmitters, and constitute an important target family for drug discovery. GPCRs exist in different conformational states that allow the transmission of diverse signals and that can be modified by ligand interactions, among other factors. To translate the MD-encoded protein dynamics, two types of 3DDPDs were considered: one-hot encoded residue-specific (rs) and embedding-like protein-specific (ps) 3DDPDs. The descriptors were developed by calculating distributions of trajectory coordinates and partial charges, applying dimensionality reduction, and subsequently condensing them into vectors per residue or protein, respectively. 3DDPDs were benchmarked on several PCM tasks against state-of-the-art non-dynamic protein descriptors. Our rs- and ps3DDPDs outperformed non-dynamic descriptors in regression tasks using a temporal split and showed comparable performance with a random split and in all classification tasks. Combinations of non-dynamic descriptors with 3DDPDs did not result in increased performance. Finally, the power of 3DDPDs to capture dynamic fluctuations in mutant GPCRs was explored. The results presented here show the potential of including protein dynamic information in machine learning tasks, specifically bioactivity prediction, and open opportunities for applications in drug discovery, including oncology. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-023-00745-5.
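The description above outlines the descriptor construction only in broad strokes (distributions of trajectory coordinates, dimensionality reduction, condensation into vectors per residue or per protein). As a rough illustration of the first and last of those steps, the following is a minimal sketch and not the authors' implementation. It assumes a trajectory is given as a list of frames, each frame a list of (x, y, z) tuples with one tuple per residue. The function names, the choice of summary statistics, and the averaging used for the per-protein vector are all hypothetical. Partial charges, the dimensionality reduction step, and the one-hot and embedding-like encodings are deliberately left out.

```python
from statistics import mean, median, pstdev


def residue_distribution_vector(frames, residue):
    """Summarise one residue's x, y and z distributions over all frames.

    Hypothetical choice of statistics: mean, median and population
    standard deviation per axis, giving a 9-element vector.
    """
    vector = []
    for axis in range(3):
        values = [frame[residue][axis] for frame in frames]
        vector.extend([mean(values), median(values), pstdev(values)])
    return vector


def residue_descriptors(frames):
    """Return one summary vector per residue (residue-level view)."""
    n_residues = len(frames[0])
    return [residue_distribution_vector(frames, r) for r in range(n_residues)]


def protein_descriptor(per_residue):
    """Condense residue vectors into one protein-level vector.

    Assumption: plain column-wise averaging stands in for whatever
    condensation the paper actually uses.
    """
    return [mean(column) for column in zip(*per_residue)]


# Toy trajectory (made-up numbers): 3 frames, 2 residues, (x, y, z) each.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(1.0, 0.0, 0.0), (1.0, 2.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 3.0, 1.0)],
]

rs_vectors = residue_descriptors(toy_frames)   # one vector per residue
ps_vector = protein_descriptor(rs_vectors)     # one vector per protein
```

In this toy case the first residue moves only along x and the second only along y, so the standard-deviation entries are non-zero only on those axes. A real pipeline would read coordinates from MD trajectory files and apply a dimensionality reduction step before condensing.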
format Online
Article
Text
id pubmed-10463931
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-10463931 2023-08-30 J Cheminform Research Springer International Publishing 2023-08-28 /pmc/articles/PMC10463931/ /pubmed/37641107 http://dx.doi.org/10.1186/s13321-023-00745-5 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title 3DDPDs: describing protein dynamics for proteochemometric bioactivity prediction. A case for (mutant) G protein-coupled receptors
topic Research