
Protein–ligand binding affinity prediction exploiting sequence constituent homology

MOTIVATION: Molecular docking is a commonly used approach for estimating binding conformations and their resultant binding affinities. Machine learning has been successfully deployed to enhance such affinity estimations. Many methods of varying complexity have been developed making use of some or al...


Bibliographic Details
Main authors: Abdel-Rehim, Abbi, Orhobor, Oghenejokpeme, Hang, Lou, Ni, Hao, King, Ross D
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10463547/
https://www.ncbi.nlm.nih.gov/pubmed/37572302
http://dx.doi.org/10.1093/bioinformatics/btad502
author Abdel-Rehim, Abbi
Orhobor, Oghenejokpeme
Hang, Lou
Ni, Hao
King, Ross D
collection PubMed
description MOTIVATION: Molecular docking is a commonly used approach for estimating binding conformations and their resultant binding affinities. Machine learning has been successfully deployed to enhance such affinity estimations. Many methods of varying complexity have been developed, making use of some or all of the spatial and categorical information available in these structures. The evaluation of such methods has mainly been carried out using datasets from PDBbind, particularly the Comparative Assessment of Scoring Functions (CASF) 2007, 2013, and 2016 datasets with dedicated test sets. This work demonstrates that only a small number of simple descriptors is necessary to efficiently estimate binding affinity for these complexes without the need to know the exact binding conformation of a ligand. RESULTS: The developed approach of using a small number of ligand and protein descriptors in conjunction with gradient boosting trees demonstrates high performance on the CASF datasets. This includes the commonly used benchmark CASF2016, where it appears to perform better than any other approach. This methodology is also useful for datasets where the spatial relationship between the ligand and protein is unknown, as demonstrated using a large ChEMBL-derived dataset. AVAILABILITY AND IMPLEMENTATION: Code and data are available at https://github.com/abbiAR/PLBAffinity.
format Online
Article
Text
id pubmed-10463547
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10463547 2023-08-30 Bioinformatics Original Paper Oxford University Press 2023-08-12 /pmc/articles/PMC10463547/ /pubmed/37572302 http://dx.doi.org/10.1093/bioinformatics/btad502 Text en © The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Original Paper