
An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models

Protein-ligand binding affinity is a key pharmacodynamic endpoint in drug discovery. Sole reliance on experimental design, make, and test cycles is costly and time consuming, providing an opportunity for computational methods to assist. Herein, we present results comparing random forest and feed-for...


Bibliographic Details
Main authors: Parks, Conor; Gaieb, Zied; Amaro, Rommie E.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Molecular Biosciences
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7328444/
https://www.ncbi.nlm.nih.gov/pubmed/32671093
http://dx.doi.org/10.3389/fmolb.2020.00093
author Parks, Conor
Gaieb, Zied
Amaro, Rommie E.
collection PubMed
description Protein-ligand binding affinity is a key pharmacodynamic endpoint in drug discovery. Sole reliance on experimental design, make, and test cycles is costly and time consuming, providing an opportunity for computational methods to assist. Herein, we present results comparing random forest and feed-forward neural network proteochemometric models for their ability to predict pIC50 measurements for held out generic Bemis-Murcko scaffolds. In addition, we assess the ability of conformal prediction to provide calibrated prediction intervals in both a retrospective and semi-prospective test using the recently released Grand Challenge 4 data set as an external test set. In total, random forest and deep neural network proteochemometric models show quality retrospective performance but suffer in the semi-prospective setting. However, the conformal predictor prediction intervals prove to be well-calibrated both retrospectively and semi-prospectively showing that they can be used to guide hit discovery and lead optimization campaigns.
format Online
Article
Text
id pubmed-7328444
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7328444 2020-07-14
An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models
Parks, Conor
Gaieb, Zied
Amaro, Rommie E.
Front Mol Biosci
Molecular Biosciences
Frontiers Media S.A. 2020-06-24
/pmc/articles/PMC7328444/
/pubmed/32671093
http://dx.doi.org/10.3389/fmolb.2020.00093
Text en
Copyright © 2020 Parks, Gaieb and Amaro.
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models
topic Molecular Biosciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7328444/
https://www.ncbi.nlm.nih.gov/pubmed/32671093
http://dx.doi.org/10.3389/fmolb.2020.00093