Machine Learning-Boosted Docking Enables the Efficient Structure-Based Virtual Screening of Giga-Scale Enumerated Chemical Libraries
The emergence of ultra-large screening libraries, filled to the brim with billions of readily available compounds, poses a growing challenge for docking-based virtual screening. Machine learning (ML)-boosted strategies like the tool HASTEN combine rapid ML prediction with the brute...
Main authors: | Sivula, Toni; Yetukuri, Laxman; Kalliokoski, Tuomo; Käsnänen, Heikki; Poso, Antti; Pöhner, Ina |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2023 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10523430/ https://www.ncbi.nlm.nih.gov/pubmed/37655823 http://dx.doi.org/10.1021/acs.jcim.3c01239 |
_version_ | 1785110564578525184 |
---|---|
author | Sivula, Toni; Yetukuri, Laxman; Kalliokoski, Tuomo; Käsnänen, Heikki; Poso, Antti; Pöhner, Ina |
author_sort | Sivula, Toni |
collection | PubMed |
description | The emergence of ultra-large screening libraries, filled to the brim with billions of readily available compounds, poses a growing challenge for docking-based virtual screening. Machine learning (ML)-boosted strategies like the tool HASTEN combine rapid ML prediction with the brute-force docking of small fractions of such libraries to increase screening throughput and take on giga-scale libraries. In our case study of an anti-bacterial chaperone and an anti-viral kinase, we first generated a brute-force docking baseline for 1.56 billion compounds in the Enamine REAL lead-like library with the fast Glide high-throughput virtual screening protocol. With HASTEN, we observed robust recall of 90% of the true 1000 top-scoring virtual hits in both targets when docking only 1% of the entire library. This reduction of the required docking experiments by 99% significantly shortens the screening time. In the kinase target, the employment of a hydrogen bonding constraint resulted in a major proportion of unsuccessful docking attempts and hampered ML predictions. We demonstrate the optimization potential in the treatment of failed compounds when performing ML-boosted screening and benchmark and showcase HASTEN as a fast and robust tool in a growing arsenal of approaches to unlock the chemical space covered by giga-scale screening libraries for everyday drug discovery campaigns. |
format | Online Article Text |
id | pubmed-10523430 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-10523430 2023-09-28 Machine Learning-Boosted Docking Enables the Efficient Structure-Based Virtual Screening of Giga-Scale Enumerated Chemical Libraries Sivula, Toni; Yetukuri, Laxman; Kalliokoski, Tuomo; Käsnänen, Heikki; Poso, Antti; Pöhner, Ina J Chem Inf Model American Chemical Society 2023-09-01 /pmc/articles/PMC10523430/ /pubmed/37655823 http://dx.doi.org/10.1021/acs.jcim.3c01239 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/). |
title | Machine Learning-Boosted Docking Enables the Efficient Structure-Based Virtual Screening of Giga-Scale Enumerated Chemical Libraries |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10523430/ https://www.ncbi.nlm.nih.gov/pubmed/37655823 http://dx.doi.org/10.1021/acs.jcim.3c01239 |
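
The description in this record reports two figures that are easy to check by hand: the number of compounds docked when only 1% of the 1.56 billion compound library is processed, and the recall of the true top-scoring hits. The Python sketch below illustrates both calculations under stated assumptions. The function names, the toy compound identifiers, and the toy scores are hypothetical and are not taken from HASTEN or from the article; lower (more negative) docking scores are assumed to be better, as is the usual convention for docking scores.

```python
def docked_count(library_size: int, percent: int) -> int:
    """Number of compounds docked when `percent` % of the library is docked."""
    return library_size * percent // 100


def recall_of_top_k(reference_scores: dict[str, float],
                    docked_ids: set[str], k: int) -> float:
    """Fraction of the k best reference compounds that were actually docked.

    reference_scores: brute-force docking score per compound (hypothetical data).
    docked_ids: compounds docked during the ML-boosted run (hypothetical data).
    """
    if not 0 < k <= len(reference_scores):
        raise ValueError("k must be between 1 and the number of scored compounds")
    # Assumption: lower (more negative) scores rank first.
    ranked = sorted(reference_scores, key=reference_scores.get)
    top_k = set(ranked[:k])
    return len(top_k & docked_ids) / k


if __name__ == "__main__":
    # 1% of the 1.56 billion compounds mentioned in the description.
    print(docked_count(1_560_000_000, 1))

    # Toy example with four made-up compounds.
    toy_scores = {"a": -9.0, "b": -8.5, "c": -7.0, "d": -3.0}
    print(recall_of_top_k(toy_scores, {"a", "c"}, k=2))
```

In the toy example the true top two compounds are `a` and `b`, and only `a` was docked, so the recall is 0.5. With the figures from the description, a recall of 90% for k = 1000 would correspond to 900 of the 1000 best brute-force hits appearing among the docked compounds.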