A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability

Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient on problems in which the data are imbalanced and the features describing the chemical characteristics of ligands are high-dimensional. We h...

Bibliographic Details
Main Authors: Dai, Weixing, Guo, Dianjing
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6651094/
https://www.ncbi.nlm.nih.gov/pubmed/31262005
http://dx.doi.org/10.3390/molecules24132414
author Dai, Weixing
Guo, Dianjing
collection PubMed
description Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient on problems in which the data are imbalanced and the features describing the chemical characteristics of ligands are high-dimensional. Here we describe a machine learning algorithm, LBS (local beta screening), for ligand-based virtual screening. The distinguishing characteristic of LBS is that it quantifies the generalization ability of screening directly through a refined loss function, and can therefore assess the risk of over-fitting accurately and efficiently for imbalanced, high-dimensional data without resorting to resampling methods such as cross-validation. The robustness of LBS was demonstrated in a simulation study and in tests on real datasets, in which LBS outperformed conventional algorithms in screening accuracy and model interpretation. LBS was then used to screen for potential activators of HIV-1 integrase multimerization in an independent compound library, and the virtual screening result was validated experimentally. Of the 25 compounds tested, six proved to be active. The most potent compound in the experimental validation showed an EC50 value of 0.71 µM.
format Online
Article
Text
id pubmed-6651094
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6651094 2019-08-07 Molecules Article MDPI 2019-06-30 /pmc/articles/PMC6651094/ /pubmed/31262005 http://dx.doi.org/10.3390/molecules24132414 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).