Prediction of the Aqueous Solubility of Compounds Based on Light Gradient Boosting Machines with Molecular Fingerprints and the Cuckoo Search Algorithm
Aqueous solubility is one of the most important physicochemical properties in drug discovery. At present, the prediction of aqueous solubility of compounds is still a challenging problem. Machine learning has shown great potential in solubility prediction. Most machine learning mod...
Main authors: | Li, Mengshan; Chen, Huijie; Zhang, Hang; Zeng, Ming; Chen, Bingsheng; Guan, Lixin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
American Chemical Society
2022
|
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9685740/ https://www.ncbi.nlm.nih.gov/pubmed/36440111 http://dx.doi.org/10.1021/acsomega.2c03885 |
_version_ | 1784835579400159232 |
---|---|
author | Li, Mengshan Chen, Huijie Zhang, Hang Zeng, Ming Chen, Bingsheng Guan, Lixin |
author_facet | Li, Mengshan Chen, Huijie Zhang, Hang Zeng, Ming Chen, Bingsheng Guan, Lixin |
author_sort | Li, Mengshan |
collection | PubMed |
description | Aqueous solubility is one of the most important physicochemical properties in drug discovery. At present, the prediction of aqueous solubility of compounds is still a challenging problem. Machine learning has shown great potential in solubility prediction. Most machine learning models largely rely on the setting of hyperparameters, and their performance can be improved by setting the hyperparameters in a better way. In this paper, we used MACCS fingerprints to represent the structural features and optimized the hyperparameters of the light gradient boosting machine (LightGBM) with the cuckoo search algorithm (CS). Based on the above representation and optimization, the CS-LightGBM model was established to predict the aqueous solubility of 2446 organic compounds and the obtained prediction results were compared with those obtained with the other six different machine learning models (RF, GBDT, XGBoost, LightGBM, SVR, and BO-LightGBM). The comparison results showed that the CS-LightGBM model had a better prediction performance than the other six different models. RMSE, MAE, and R² of the CS-LightGBM model were, respectively, 0.7785, 0.5117, and 0.8575. In addition, this model has good scalability and can be used to solve solubility prediction problems in other fields such as solvent selection and drug screening. |
format | Online Article Text |
id | pubmed-9685740 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-96857402022-11-25 Prediction of the Aqueous Solubility of Compounds Based on Light Gradient Boosting Machines with Molecular Fingerprints and the Cuckoo Search Algorithm Li, Mengshan Chen, Huijie Zhang, Hang Zeng, Ming Chen, Bingsheng Guan, Lixin ACS Omega Aqueous solubility is one of the most important physicochemical properties in drug discovery. At present, the prediction of aqueous solubility of compounds is still a challenging problem. Machine learning has shown great potential in solubility prediction. Most machine learning models largely rely on the setting of hyperparameters, and their performance can be improved by setting the hyperparameters in a better way. In this paper, we used MACCS fingerprints to represent the structural features and optimized the hyperparameters of the light gradient boosting machine (LightGBM) with the cuckoo search algorithm (CS). Based on the above representation and optimization, the CS-LightGBM model was established to predict the aqueous solubility of 2446 organic compounds and the obtained prediction results were compared with those obtained with the other six different machine learning models (RF, GBDT, XGBoost, LightGBM, SVR, and BO-LightGBM). The comparison results showed that the CS-LightGBM model had a better prediction performance than the other six different models. RMSE, MAE, and R² of the CS-LightGBM model were, respectively, 0.7785, 0.5117, and 0.8575. In addition, this model has good scalability and can be used to solve solubility prediction problems in other fields such as solvent selection and drug screening. American Chemical Society 2022-11-08 /pmc/articles/PMC9685740/ /pubmed/36440111 http://dx.doi.org/10.1021/acsomega.2c03885 Text en © 2022 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Li, Mengshan Chen, Huijie Zhang, Hang Zeng, Ming Chen, Bingsheng Guan, Lixin Prediction of the Aqueous Solubility of Compounds Based on Light Gradient Boosting Machines with Molecular Fingerprints and the Cuckoo Search Algorithm |
title | Prediction of the Aqueous Solubility of Compounds Based on Light Gradient Boosting Machines with Molecular Fingerprints and the Cuckoo Search Algorithm |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9685740/ https://www.ncbi.nlm.nih.gov/pubmed/36440111 http://dx.doi.org/10.1021/acsomega.2c03885 |
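The description above says that LightGBM hyperparameters were tuned with the cuckoo search algorithm (CS). The following is a minimal, simplified sketch of cuckoo search in plain Python, not the authors' implementation. Everything specific in it is an assumption made for illustration: the two tuned parameters (`learning_rate`, `num_leaves`) and their bounds, the population size, the abandonment fraction `pa`, the step scale `alpha`, and above all `toy_objective`, which is a hypothetical stand-in for a cross-validated RMSE that a real workflow would obtain by training LightGBM on MACCS fingerprints. The record does not state which hyperparameters or settings the paper used.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against an exact zero draw
    return u / abs(v) ** (1 / beta)


def clip(point, bounds):
    """Project a candidate back into the search box."""
    return [min(max(x, lo), hi) for x, (lo, hi) in zip(point, bounds)]


def random_nest(rng, bounds):
    """Sample a candidate uniformly inside the search box."""
    return [rng.uniform(lo, hi) for lo, hi in bounds]


def cuckoo_search(objective, bounds, n_nests=15, pa=0.25, alpha=0.01,
                  n_iter=200, seed=0):
    """Minimise `objective` over `bounds`; returns (best_point, best_score).

    Simplified variant (assumption): Levy steps are scaled by each
    parameter's range, and the worst fraction `pa` of nests is rebuilt
    at random after every generation.
    """
    rng = random.Random(seed)
    nests = [random_nest(rng, bounds) for _ in range(n_nests)]
    scores = [objective(nest) for nest in nests]
    n_abandon = int(pa * n_nests)

    for _ in range(n_iter):
        for i in range(n_nests):
            # A cuckoo starts from nest i and takes a Levy flight.
            candidate = clip(
                [x + alpha * (hi - lo) * levy_step(rng)
                 for x, (lo, hi) in zip(nests[i], bounds)],
                bounds,
            )
            cand_score = objective(candidate)
            # It replaces a randomly chosen nest only if it is better.
            j = rng.randrange(n_nests)
            if cand_score < scores[j]:
                nests[j], scores[j] = candidate, cand_score
        # Abandon the worst nests and build new random ones.
        worst_first = sorted(range(n_nests), key=lambda k: scores[k],
                             reverse=True)
        for k in worst_first[:n_abandon]:
            nests[k] = random_nest(rng, bounds)
            scores[k] = objective(nests[k])

    best_idx = min(range(n_nests), key=lambda k: scores[k])
    return nests[best_idx], scores[best_idx]


# Hypothetical search box: (learning_rate, num_leaves).
BOUNDS = [(0.01, 0.3), (8.0, 256.0)]


def toy_objective(params):
    """Stand-in for a cross-validated RMSE (purely illustrative)."""
    learning_rate, num_leaves = params
    return (learning_rate - 0.1) ** 2 + ((num_leaves - 31.0) / 100.0) ** 2


best, score = cuckoo_search(toy_objective, BOUNDS, n_iter=100, seed=42)
print("best parameters:", best, "objective:", score)
```

In a real setting, `toy_objective` would be replaced by a function that rounds integer-valued parameters, fits the model, and returns a validation error such as RMSE. Because a nest is only overwritten by a better candidate and the best nest is never abandoned, the best score in this sketch cannot get worse from one generation to the next.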