Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000) Accuracy near that of random forest regression
Yalkowsky’s General Solubility Equation (GSE), with its three fixed constants, is popular and easy to apply, but is not very accurate for polar, zwitterionic, or flexible molecules. This review examines the findings of a series of studies, where we have sought to come up with a better prediction mod...
Main author: | Avdeef, Alex |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | International Association of Physical Chemists, 2023 |
Subjects: | Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10567068/ https://www.ncbi.nlm.nih.gov/pubmed/37829322 http://dx.doi.org/10.5599/admet.1879 |
_version_ | 1785119045053317120 |
---|---|
author | Avdeef, Alex |
collection | PubMed |
description | Yalkowsky’s General Solubility Equation (GSE), with its three fixed constants, is popular and easy to apply, but is not very accurate for polar, zwitterionic, or flexible molecules. This review examines the findings of a series of studies in which we sought a better prediction model by comparing the performance of the GSE with that of Abraham’s Solvation Equation (ABSOLV) and the Random Forest regression (RFR) machine-learning (ML) method. Large, well-curated aqueous intrinsic solubility databases are available. However, drugs may be sparsely distributed in chemical space, concentrated in clusters. Even a large database might overlook some regions. Test compounds from under-represented portions of space may be poorly predicted, as might be the case with the ‘loose’ set of 32 drugs in the Second Solubility Challenge (2020). There still appears to be a need for better coverage of drug space. Increasingly, solubility predictions use calculated input descriptors, which may be an advantage for exploring properties of molecules yet to be synthesized. The risk is that the overall prediction may then rest on accumulated uncertainty. The increasing use of ML/AI methods can lead to accurate predictions, but such predictions may not readily suggest the strategies to pursue in selecting yet-to-be-synthesized compounds. Based on our latest findings, we recommend predictions based on both ‘grouped’ ABSOLV(GRP) and ‘Flexible Acceptor’ GSE(Φ,B) models with the provided best-fit parameters, where Φ is the Kier molecular flexibility index and B is the Abraham H-bond acceptor strength. For molecules with Φ < 11, the prudent choice is to pick the Consensus Model, the average of ABSOLV(GRP) and GSE(Φ,B). For more flexible molecules, GSE(Φ,B) is recommended. |
format | Online Article Text |
id | pubmed-10567068 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | International Association of Physical Chemists |
record_format | MEDLINE/PubMed |
spelling | pubmed-10567068 2023-10-12 Avdeef, Alex ADMET DMPK Review International Association of Physical Chemists 2023-08-21 /pmc/articles/PMC10567068/ /pubmed/37829322 http://dx.doi.org/10.5599/admet.1879 Text en Copyright © 2023 by the authors. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Review Avdeef, Alex Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000) Accuracy near that of random forest regression |
title | Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000) Accuracy near that of random forest regression |
title_sort | mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (mw<2000) accuracy near that of random forest regression |
topic | Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10567068/ https://www.ncbi.nlm.nih.gov/pubmed/37829322 http://dx.doi.org/10.5599/admet.1879 |
work_keys_str_mv | AT avdeefalex mechanisticallytransparentmodelsforpredictingaqueoussolubilityofrigidslightlyflexibleandveryflexibledrugsmw2000accuracynearthatofrandomforestregression |
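
The description field above recommends a simple selection rule: average ABSOLV(GRP) and GSE(Φ,B) when Φ < 11, and use GSE(Φ,B) alone for more flexible molecules. The following is a minimal Python sketch of that rule only. The function and parameter names are hypothetical, and the two model predictions are taken as inputs because the record does not list the best-fit parameters. For context, the sketch also includes the commonly cited classic form of Yalkowsky's GSE (log S = 0.5 − 0.01(mp − 25) − log P, with the melting-point term set to zero below 25 °C); that form is an assumption drawn from general literature, not from this record.

```python
# Hypothetical helper names; only the Phi < 11 rule comes from the abstract.
PHI_THRESHOLD = 11.0  # Kier flexibility index cutoff stated in the abstract


def classic_gse(log_p: float, melting_point_c: float) -> float:
    """Classic GSE in its commonly cited form (assumption, not from this record).

    log S = 0.5 - 0.01 * (mp - 25) - log P, where the melting-point term
    is taken as zero for compounds melting below 25 degrees C.
    """
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_p


def recommended_log_s(phi: float,
                      log_s_absolv_grp: float,
                      log_s_gse_phi_b: float) -> float:
    """Apply the selection rule described in the abstract.

    phi               -- Kier molecular flexibility index of the molecule
    log_s_absolv_grp  -- log S predicted by the ABSOLV(GRP) model (supplied by caller)
    log_s_gse_phi_b   -- log S predicted by the GSE(Phi,B) model (supplied by caller)
    """
    if phi < PHI_THRESHOLD:
        # Consensus Model: plain average of the two predictions.
        return 0.5 * (log_s_absolv_grp + log_s_gse_phi_b)
    # More flexible molecules: GSE(Phi,B) alone.
    return log_s_gse_phi_b


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the paper.
    print(recommended_log_s(phi=5.0, log_s_absolv_grp=-3.0, log_s_gse_phi_b=-4.0))
    print(recommended_log_s(phi=15.0, log_s_absolv_grp=-3.0, log_s_gse_phi_b=-4.0))
```

Keeping the two model predictions as arguments avoids guessing at coefficients that this record does not provide; the fitted ABSOLV(GRP) and GSE(Φ,B) equations would need to be taken from the article itself.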