Predictive Modeling of PROTAC Cell Permeability with Machine Learning

Approaches for predicting proteolysis targeting chimera (PROTAC) cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeable PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binar...

Bibliographic Details
Main Authors: Poongavanam, Vasanthanathan; Kölling, Florian; Giese, Anja; Göller, Andreas H.; Lehmann, Lutz; Meibom, Daniel; Kihlberg, Jan
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933238/
https://www.ncbi.nlm.nih.gov/pubmed/36816707
http://dx.doi.org/10.1021/acsomega.2c07717
author Poongavanam, Vasanthanathan
Kölling, Florian
Giese, Anja
Göller, Andreas H.
Lehmann, Lutz
Meibom, Daniel
Kihlberg, Jan
author_sort Poongavanam, Vasanthanathan
collection PubMed
description Approaches for predicting proteolysis targeting chimera (PROTAC) cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeable PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binary classification models developed using 17 simple descriptors for large and structurally diverse sets of cereblon (CRBN) and von Hippel–Lindau (VHL) PROTACs. For the VHL PROTAC set, k-nearest neighbor and random forest models performed best and predicted the permeability of a blinded test set with >80% accuracy (κ ≥ 0.57). Models retrained by combining the original training and the blinded test set performed equally well for a second blinded VHL set. However, models for CRBN PROTACs were less successful, mainly due to the imbalanced nature of the CRBN datasets. All descriptors contributed to the models, but size and lipophilicity were the most important. We conclude that properly trained machine learning models can be integrated as effective filters in the PROTAC design process.
format Online
Article
Text
id pubmed-9933238
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9933238 2023-02-17 Predictive Modeling of PROTAC Cell Permeability with Machine Learning Poongavanam, Vasanthanathan Kölling, Florian Giese, Anja Göller, Andreas H. Lehmann, Lutz Meibom, Daniel Kihlberg, Jan ACS Omega Approaches for predicting proteolysis targeting chimera (PROTAC) cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeable PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binary classification models developed using 17 simple descriptors for large and structurally diverse sets of cereblon (CRBN) and von Hippel–Lindau (VHL) PROTACs. For the VHL PROTAC set, k-nearest neighbor and random forest models performed best and predicted the permeability of a blinded test set with >80% accuracy (κ ≥ 0.57). Models retrained by combining the original training and the blinded test set performed equally well for a second blinded VHL set. However, models for CRBN PROTACs were less successful, mainly due to the imbalanced nature of the CRBN datasets. All descriptors contributed to the models, but size and lipophilicity were the most important. We conclude that properly trained machine learning models can be integrated as effective filters in the PROTAC design process. American Chemical Society 2023-02-01 /pmc/articles/PMC9933238/ /pubmed/36816707 http://dx.doi.org/10.1021/acsomega.2c07717 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/).
title Predictive Modeling of PROTAC Cell Permeability with Machine Learning
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933238/
https://www.ncbi.nlm.nih.gov/pubmed/36816707
http://dx.doi.org/10.1021/acsomega.2c07717