
Accelerating high-throughput virtual screening through molecular pool-based active learning

Structure-based virtual screening is an important tool in early-stage drug discovery that scores the interactions between a target protein and candidate ligands. As virtual libraries continue to grow (in excess of 10^8 molecules), so too do the resources necessary to conduct exhaustive virtual scre...


Bibliographic Details
Main Authors:	Graff, David E.; Shakhnovich, Eugene I.; Coley, Connor W.
Format:	Online Article Text
Language:	English
Published:	The Royal Society of Chemistry 2021
Subjects:	Chemistry
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8188596/ https://www.ncbi.nlm.nih.gov/pubmed/34168840 http://dx.doi.org/10.1039/d0sc06805e

author	Graff, David E.; Shakhnovich, Eugene I.; Coley, Connor W.
collection	PubMed
description	Structure-based virtual screening is an important tool in early-stage drug discovery that scores the interactions between a target protein and candidate ligands. As virtual libraries continue to grow (in excess of 10^8 molecules), so too do the resources necessary to conduct exhaustive virtual screening campaigns on these libraries. However, Bayesian optimization techniques, previously employed in other scientific discovery problems, can aid in their exploration: a surrogate structure–property relationship model trained on the predicted affinities of a subset of the library can be applied to the remaining library members, allowing the least promising compounds to be excluded from evaluation. In this study, we explore the application of these techniques to computational docking datasets and assess the impact of surrogate model architecture, acquisition function, and acquisition batch size on optimization performance. We observe significant reductions in computational costs; for example, using a directed message-passing neural network we can identify 94.8% or 89.3% of the top-50 000 ligands in a 100M member library after testing only 2.4% of candidate ligands using an upper confidence bound or greedy acquisition strategy, respectively. Such model-guided searches mitigate the increasing computational costs of screening increasingly large virtual libraries and can accelerate high-throughput virtual screening campaigns with applications beyond docking.
format	Online Article Text
id	pubmed-8188596
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	The Royal Society of Chemistry
record_format	MEDLINE/PubMed
spelling	pubmed-8188596 2021-06-23 Graff, David E. Shakhnovich, Eugene I. Coley, Connor W. Chem Sci Chemistry The Royal Society of Chemistry 2021-04-29 /pmc/articles/PMC8188596/ /pubmed/34168840 http://dx.doi.org/10.1039/d0sc06805e Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by/3.0/
title	Accelerating high-throughput virtual screening through molecular pool-based active learning
topic	Chemistry
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8188596/ https://www.ncbi.nlm.nih.gov/pubmed/34168840 http://dx.doi.org/10.1039/d0sc06805e
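
The search loop summarized in the description field (train a surrogate on the scored subset, rank the rest of the pool with an acquisition function, score the next batch) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: `true_score` is a hypothetical stand-in for a docking calculation (higher is better), the surrogate is a simple nearest-neighbour average instead of a neural network, and the pool, batch size, and `beta` are arbitrary.

```python
import math
import random
import statistics


def true_score(x):
    """Hypothetical stand-in for an expensive docking score (higher is better)."""
    return math.sin(3 * x) + 0.5 * math.cos(7 * x)


def knn_predict(x, evaluated, k=5):
    """Toy surrogate: mean and spread of the k nearest already-scored points."""
    nearest = sorted(evaluated, key=lambda item: abs(item[0] - x))[:k]
    values = [score for _, score in nearest]
    return statistics.fmean(values), statistics.pstdev(values)


def acquire(candidates, evaluated, batch_size, strategy="greedy", beta=2.0):
    """Return the batch of unscored candidates with the highest acquisition value."""
    def utility(x):
        mean, std = knn_predict(x, evaluated)
        # Greedy uses the predicted mean; UCB adds an uncertainty bonus.
        return mean + beta * std if strategy == "ucb" else mean
    return sorted(candidates, key=utility, reverse=True)[:batch_size]


def run_search(pool, init_size=20, batch_size=20, iterations=5,
               strategy="greedy", seed=0):
    """Pool-based search: random initial batch, then model-guided batches."""
    rng = random.Random(seed)
    remaining = set(pool)
    batch = rng.sample(sorted(remaining), init_size)
    evaluated = [(x, true_score(x)) for x in batch]
    remaining.difference_update(batch)
    for _ in range(iterations):
        batch = acquire(sorted(remaining), evaluated, batch_size, strategy)
        evaluated.extend((x, true_score(x)) for x in batch)
        remaining.difference_update(batch)
    return evaluated


def fraction_of_top_k(evaluated, pool, k=50):
    """Share of the pool's true top-k members that the search has scored."""
    top_k = set(sorted(pool, key=true_score, reverse=True)[:k])
    found = {x for x, _ in evaluated}
    return len(top_k & found) / k


if __name__ == "__main__":
    pool = [i / 1000 for i in range(2000)]  # synthetic one-dimensional "library"
    for strategy in ("greedy", "ucb"):
        scored = run_search(pool, strategy=strategy)
        print(strategy, len(scored), fraction_of_top_k(scored, pool))
```

With these defaults each run scores 120 of the 2000 pool members (6%), and `fraction_of_top_k` mirrors the kind of recovery metric quoted in the description. The numbers this toy produces say nothing about the published results.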
