An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models
Protein-ligand binding affinity is a key pharmacodynamic endpoint in drug discovery. Sole reliance on experimental design, make, and test cycles is costly and time-consuming, providing an opportunity for computational methods to assist. Herein, we present results comparing random forest and feed-for...
Main Authors: | Parks, Conor, Gaieb, Zied, Amaro, Rommie E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Molecular Biosciences |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7328444/ https://www.ncbi.nlm.nih.gov/pubmed/32671093 http://dx.doi.org/10.3389/fmolb.2020.00093 |
_version_ | 1783552727316430848 |
---|---|
author | Parks, Conor Gaieb, Zied Amaro, Rommie E. |
author_facet | Parks, Conor Gaieb, Zied Amaro, Rommie E. |
author_sort | Parks, Conor |
collection | PubMed |
description | Protein-ligand binding affinity is a key pharmacodynamic endpoint in drug discovery. Sole reliance on experimental design, make, and test cycles is costly and time-consuming, providing an opportunity for computational methods to assist. Herein, we present results comparing random forest and feed-forward neural network proteochemometric models for their ability to predict pIC50 measurements for held-out generic Bemis-Murcko scaffolds. In addition, we assess the ability of conformal prediction to provide calibrated prediction intervals in both a retrospective and a semi-prospective test, using the recently released Grand Challenge 4 data set as an external test set. In total, random forest and deep neural network proteochemometric models show quality retrospective performance but suffer in the semi-prospective setting. However, the conformal predictor prediction intervals prove to be well-calibrated both retrospectively and semi-prospectively, showing that they can be used to guide hit discovery and lead optimization campaigns. |
format | Online Article Text |
id | pubmed-7328444 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-73284442020-07-14 An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models Parks, Conor Gaieb, Zied Amaro, Rommie E. Front Mol Biosci Molecular Biosciences Protein-ligand binding affinity is a key pharmacodynamic endpoint in drug discovery. Sole reliance on experimental design, make, and test cycles is costly and time consuming, providing an opportunity for computational methods to assist. Herein, we present results comparing random forest and feed-forward neural network proteochemometric models for their ability to predict pIC50 measurements for held out generic Bemis-Murcko scaffolds. In addition, we assess the ability of conformal prediction to provide calibrated prediction intervals in both a retrospective and semi-prospective test using the recently released Grand Challenge 4 data set as an external test set. In total, random forest and deep neural network proteochemometric models show quality retrospective performance but suffer in the semi-prospective setting. However, the conformal predictor prediction intervals prove to be well-calibrated both retrospectively and semi-prospectively showing that they can be used to guide hit discovery and lead optimization campaigns. Frontiers Media S.A. 2020-06-24 /pmc/articles/PMC7328444/ /pubmed/32671093 http://dx.doi.org/10.3389/fmolb.2020.00093 Text en Copyright © 2020 Parks, Gaieb and Amaro. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Molecular Biosciences Parks, Conor Gaieb, Zied Amaro, Rommie E. An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models |
title | An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models |
title_sort | analysis of proteochemometric and conformal prediction machine learning protein-ligand binding affinity models |
topic | Molecular Biosciences |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7328444/ https://www.ncbi.nlm.nih.gov/pubmed/32671093 http://dx.doi.org/10.3389/fmolb.2020.00093 |