Predictive Modeling of PROTAC Cell Permeability with Machine Learning
Approaches for predicting proteolysis targeting chimera (PROTAC) cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeable PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binar...
Main authors: | Poongavanam, Vasanthanathan; Kölling, Florian; Giese, Anja; Göller, Andreas H.; Lehmann, Lutz; Meibom, Daniel; Kihlberg, Jan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2023 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933238/ https://www.ncbi.nlm.nih.gov/pubmed/36816707 http://dx.doi.org/10.1021/acsomega.2c07717 |
_version_ | 1784889631190286336 |
---|---|
author | Poongavanam, Vasanthanathan Kölling, Florian Giese, Anja Göller, Andreas H. Lehmann, Lutz Meibom, Daniel Kihlberg, Jan |
author_sort | Poongavanam, Vasanthanathan |
collection | PubMed |
description | Approaches for predicting proteolysis targeting chimera (PROTAC) cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeable PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binary classification models developed using 17 simple descriptors for large and structurally diverse sets of cereblon (CRBN) and von Hippel–Lindau (VHL) PROTACs. For the VHL PROTAC set, k-nearest neighbor and random forest models performed best and predicted the permeability of a blinded test set with >80% accuracy (κ ≥ 0.57). Models retrained by combining the original training and the blinded test set performed equally well for a second blinded VHL set. However, models for CRBN PROTACs were less successful, mainly due to the imbalanced nature of the CRBN datasets. All descriptors contributed to the models, but size and lipophilicity were the most important. We conclude that properly trained machine learning models can be integrated as effective filters in the PROTAC design process. |
format | Online Article Text |
id | pubmed-9933238 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9933238 2023-02-17 Predictive Modeling of PROTAC Cell Permeability with Machine Learning Poongavanam, Vasanthanathan Kölling, Florian Giese, Anja Göller, Andreas H. Lehmann, Lutz Meibom, Daniel Kihlberg, Jan ACS Omega Approaches for predicting proteolysis targeting chimera (PROTAC) cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeable PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binary classification models developed using 17 simple descriptors for large and structurally diverse sets of cereblon (CRBN) and von Hippel–Lindau (VHL) PROTACs. For the VHL PROTAC set, k-nearest neighbor and random forest models performed best and predicted the permeability of a blinded test set with >80% accuracy (κ ≥ 0.57). Models retrained by combining the original training and the blinded test set performed equally well for a second blinded VHL set. However, models for CRBN PROTACs were less successful, mainly due to the imbalanced nature of the CRBN datasets. All descriptors contributed to the models, but size and lipophilicity were the most important. We conclude that properly trained machine learning models can be integrated as effective filters in the PROTAC design process. American Chemical Society 2023-02-01 /pmc/articles/PMC9933238/ /pubmed/36816707 http://dx.doi.org/10.1021/acsomega.2c07717 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/). |
title | Predictive Modeling of PROTAC Cell Permeability with Machine Learning |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933238/ https://www.ncbi.nlm.nih.gov/pubmed/36816707 http://dx.doi.org/10.1021/acsomega.2c07717 |