Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000): Accuracy near that of random forest regression

Yalkowsky’s General Solubility Equation (GSE), with its three fixed constants, is popular and easy to apply, but is not very accurate for polar, zwitterionic, or flexible molecules. This review examines the findings of a series of studies, where we have sought to come up with a better prediction mod...

Bibliographic Details
Main author: Avdeef, Alex
Format: Online Article Text
Language: English
Published: International Association of Physical Chemists 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10567068/
https://www.ncbi.nlm.nih.gov/pubmed/37829322
http://dx.doi.org/10.5599/admet.1879
author Avdeef, Alex
collection PubMed
description Yalkowsky’s General Solubility Equation (GSE), with its three fixed constants, is popular and easy to apply, but is not very accurate for polar, zwitterionic, or flexible molecules. This review examines the findings of a series of studies in which we sought a better prediction model by comparing the performance of the GSE with that of Abraham’s Solvation Equation (ABSOLV) and the Random Forest regression (RFR) machine-learning (ML) method. Large, well-curated aqueous intrinsic solubility databases are available. However, drugs may be sparsely distributed in chemical space, concentrated in clusters, so even a large database might overlook some regions. Test compounds from under-represented portions of space may be poorly predicted, as might be the case with the ‘loose’ set of 32 drugs in the Second Solubility Challenge (2020). There still appears to be a need for better coverage of drug space. Increasingly, solubility predictions use calculated input descriptors, which may be an advantage for exploring properties of molecules yet to be synthesized. The risk is that such predictions may rest on accumulated uncertainty. The increasing use of ML/AI methods can lead to accurate predictions, but such predictions may not readily suggest the strategies to pursue in selecting yet-to-be-synthesized compounds. Based on our latest findings, we recommend predictions based on both the ‘grouped’ ABSOLV(GRP) and the ‘Flexible Acceptor’ GSE(Φ,B) models with the provided best-fit parameters, where Φ is the Kier molecular flexibility index and B is the Abraham H-bond acceptor strength. For molecules with Φ < 11, the prudent choice is the Consensus Model, the average of ABSOLV(GRP) and GSE(Φ,B). For more flexible molecules, GSE(Φ,B) is recommended.
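The model-selection rule stated at the end of the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the two model predictions are taken as already-computed inputs (the best-fit parameters of ABSOLV(GRP) and GSE(Φ,B) are not given in this record), and the consensus average is assumed to be taken in log S0 units. The classic GSE is included for reference in its commonly cited form, log S0 = 0.5 − 0.01(mp − 25) − log P, with the melting-point term set to zero for compounds that melt below 25 °C.

```python
PHI_THRESHOLD = 11.0  # Kier flexibility cutoff quoted in the abstract


def classic_gse(log_p: float, mp_celsius: float) -> float:
    """Classic GSE with its three fixed constants (0.5, -0.01, -1).

    Returns log10 of intrinsic solubility (mol/L). For liquids
    (mp below 25 C) the melting-point term is conventionally dropped.
    """
    mp_term = max(mp_celsius - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_p


def recommended_log_s0(phi: float,
                       log_s_absolv_grp: float,
                       log_s_gse_phi_b: float) -> float:
    """Apply the abstract's recommendation (hypothetical helper).

    phi < 11  -> Consensus Model: mean of the two predictions
                 (assumed here to be averaged in log units).
    otherwise -> GSE(Phi,B) prediction alone.
    """
    if phi < PHI_THRESHOLD:
        return 0.5 * (log_s_absolv_grp + log_s_gse_phi_b)
    return log_s_gse_phi_b


if __name__ == "__main__":
    # Illustrative inputs only, not data from the review.
    print(classic_gse(log_p=2.0, mp_celsius=125.0))
    print(recommended_log_s0(5.0, -3.0, -4.0))
    print(recommended_log_s0(14.0, -3.0, -4.0))
```

Passing the predictions in as arguments keeps the sketch independent of how each model is fitted; only the Φ-based switch comes from the abstract.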
format Online
Article
Text
id pubmed-10567068
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher International Association of Physical Chemists
record_format MEDLINE/PubMed
spelling pubmed-10567068 2023-10-12 Avdeef, Alex ADMET DMPK Review International Association of Physical Chemists 2023-08-21 /pmc/articles/PMC10567068/ /pubmed/37829322 http://dx.doi.org/10.5599/admet.1879 Text en Copyright © 2023 by the authors. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/).
title Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000): Accuracy near that of random forest regression
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10567068/
https://www.ncbi.nlm.nih.gov/pubmed/37829322
http://dx.doi.org/10.5599/admet.1879