ZINC-22—A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery

Purchasable chemical space has grown rapidly into the tens of billions of molecules, providing unprecedented opportunities for ligand discovery but straining the tools that might exploit these molecules at scale. We have therefore developed ZINC-22, a database of commercially acces...

Bibliographic Details
Main Authors: Tingle, Benjamin I., Tang, Khanh G., Castanon, Mar, Gutierrez, John J., Khurelbaatar, Munkhzul, Dandarchuluun, Chinzorig, Moroz, Yurii S., Irwin, John J.
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9976280/
https://www.ncbi.nlm.nih.gov/pubmed/36790087
http://dx.doi.org/10.1021/acs.jcim.2c01253
author Tingle, Benjamin I.
Tang, Khanh G.
Castanon, Mar
Gutierrez, John J.
Khurelbaatar, Munkhzul
Dandarchuluun, Chinzorig
Moroz, Yurii S.
Irwin, John J.
author_sort Tingle, Benjamin I.
collection PubMed
description Purchasable chemical space has grown rapidly into the tens of billions of molecules, providing unprecedented opportunities for ligand discovery but straining the tools that might exploit these molecules at scale. We have therefore developed ZINC-22, a database of commercially accessible small molecules derived from multi-billion-scale make-on-demand libraries. The new database and tools enable analog searching in this vast new space via a facile GUI, CartBlanche, drawing on similarity methods that scale sublinearly in the number of molecules. The new library also uses data organization methods, enabling rapid lookup of molecules and their physical properties, including conformations, partial atomic charges, cLogP values, and solvation energies, all crucial for molecule docking, which had become slow with older database organizations in previous versions of ZINC. As the libraries have continued to grow, we have been interested in finding whether molecular diversity has suffered, for instance, because certain scaffolds have come to dominate via easy analoging. This has not occurred thus far, and chemical diversity continues to grow with database size, with a log increase in Bemis–Murcko scaffolds for every two-log unit increase in database size. Most new scaffolds come from compounds with the highest heavy atom count. Finally, we consider the implications for databases like ZINC as the libraries grow toward and beyond the trillion-molecule range. ZINC is freely available to everyone and may be accessed at cartblanche22.docking.org, via Globus, and in the Amazon AWS and Oracle OCI clouds.
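The scaling statement in the description (one log unit more Bemis–Murcko scaffolds for every two log units of database size) can be read as a power law, S(N) ≈ c·N^0.5. The sketch below works through that arithmetic only. The power-law reading, the default exponent of 0.5, and the function names are assumptions made for illustration; the record gives no prefactor c, so only ratios are computed, not absolute scaffold counts.

```python
def scaffold_growth_factor(size_factor: float, exponent: float = 0.5) -> float:
    """Factor by which the scaffold count grows when the library grows by size_factor.

    Assumes S(N) = c * N**exponent. The default exponent 0.5 encodes
    "one log unit of scaffolds per two log units of database size".
    """
    if size_factor <= 0:
        raise ValueError("size_factor must be positive")
    return size_factor ** exponent


def log10_scaffold_gain(log10_size_gain: float, exponent: float = 0.5) -> float:
    """Same relation in log10 units: scaffold gain = exponent * size gain."""
    return exponent * log10_size_gain


if __name__ == "__main__":
    # Compare how the scaffold count responds to 10x, 100x, and 1000x larger libraries.
    for factor in (10, 100, 1000):
        growth = scaffold_growth_factor(factor)
        print(f"{factor}x more molecules -> about {growth:.1f}x more scaffolds")
```

Under this reading, a library that grows 100-fold gains roughly 10-fold more scaffolds, so diversity keeps rising but more slowly than library size.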
format Online
Article
Text
id pubmed-9976280
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9976280 2023-03-02 ZINC-22—A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery J Chem Inf Model American Chemical Society 2023-02-15 /pmc/articles/PMC9976280/ /pubmed/36790087 http://dx.doi.org/10.1021/acs.jcim.2c01253 Text en © 2023 The Authors. Published by American Chemical Society. License: https://creativecommons.org/licenses/by/4.0/ (permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained).
title ZINC-22—A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery