A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability
Main Authors: | Dai, Weixing; Guo, Dianjing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651094/ https://www.ncbi.nlm.nih.gov/pubmed/31262005 http://dx.doi.org/10.3390/molecules24132414 |
_version_ | 1783438265629540352 |
---|---|
author | Dai, Weixing Guo, Dianjing |
author_facet | Dai, Weixing Guo, Dianjing |
author_sort | Dai, Weixing |
collection | PubMed |
description | Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient on such problems, where the data are imbalanced and the features describing the chemical characteristics of ligands are high-dimensional. We here describe a machine learning algorithm, LBS (local beta screening), for ligand-based virtual screening. The unique characteristic of LBS is that it quantifies the generalization ability of screening directly through a refined loss function, and can thus assess the risk of over-fitting accurately and efficiently for imbalanced and high-dimensional data without the help of resampling methods such as cross-validation. The robustness of LBS was demonstrated by a simulation study and by tests on real datasets, in which LBS outperformed conventional algorithms in terms of screening accuracy and model interpretation. LBS was then used to screen for potential activators of HIV-1 integrase multimerization in an independent compound library, and the virtual screening result was experimentally validated. Of the 25 compounds tested, six proved to be active. The most potent compound in experimental validation showed an EC50 value of 0.71 µM. |
format | Online Article Text |
id | pubmed-6651094 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-66510942019-08-07 A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability Dai, Weixing Guo, Dianjing Molecules Article Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient when dealing with such problems where the data are imbalanced and features describing the chemical characteristic of ligands are high-dimensional. We here describe a machine learning algorithm LBS (local beta screening) for ligand-based virtual screening. The unique characteristic of LBS is that it quantifies the generalization ability of screening directly by a refined loss function, and thus can assess the risk of over-fitting accurately and efficiently for imbalanced and high-dimensional data in ligand-based virtual screening without the help of resampling methods such as cross validation. The robustness of LBS was demonstrated by a simulation study and tests on real datasets, in which LBS outperformed conventional algorithms in terms of screening accuracy and model interpretation. LBS was then used for screening potential activators of HIV-1 integrase multimerization in an independent compound library, and the virtual screening result was experimentally validated. Of the 25 compounds tested, six were proved to be active. The most potent compound in experimental validation showed an EC(50) value of 0.71 µM. MDPI 2019-06-30 /pmc/articles/PMC6651094/ /pubmed/31262005 http://dx.doi.org/10.3390/molecules24132414 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Dai, Weixing Guo, Dianjing A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability |
title | A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability |
title_full | A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability |
title_fullStr | A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability |
title_full_unstemmed | A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability |
title_short | A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability |
title_sort | ligand-based virtual screening method using direct quantification of generalization ability |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651094/ https://www.ncbi.nlm.nih.gov/pubmed/31262005 http://dx.doi.org/10.3390/molecules24132414 |
work_keys_str_mv | AT daiweixing aligandbasedvirtualscreeningmethodusingdirectquantificationofgeneralizationability AT guodianjing aligandbasedvirtualscreeningmethodusingdirectquantificationofgeneralizationability AT daiweixing ligandbasedvirtualscreeningmethodusingdirectquantificationofgeneralizationability AT guodianjing ligandbasedvirtualscreeningmethodusingdirectquantificationofgeneralizationability |