Predicted Biological Activity of Purchasable Chemical Space

Whereas 400 million distinct compounds are now purchasable within the span of a few weeks, the biological activities of most are unknown. To facilitate access to new chemistry for biology, we have combined the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity...

Bibliographic Details
Main authors: Irwin, John J., Gaskins, Garrett, Sterling, Teague, Mysinger, Michael M., Keiser, Michael J.
Format: Online Article Text
Language: English
Published: American Chemical Society 2017
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5780839/
https://www.ncbi.nlm.nih.gov/pubmed/29193970
http://dx.doi.org/10.1021/acs.jcim.7b00316
collection PubMed
description Whereas 400 million distinct compounds are now purchasable within the span of a few weeks, the biological activities of most are unknown. To facilitate access to new chemistry for biology, we have combined the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity to the nearest bioactive to predict activity for every commercially available molecule in ZINC. This method, which we label SEA+TC, outperforms both SEA and a naïve-Bayesian classifier via predictive performance on a 5-fold cross-validation of ChEMBL’s bioactivity data set (version 21). Using this method, predictions for over 40% of compounds (>160 million) have either high significance (pSEA ≥ 40), high similarity (ECFP4MaxTc ≥ 0.4), or both, for one or more of 1382 targets well described by ligands in the literature. Using a further 1347 less-well-described targets, we predict activities for an additional 11 million compounds. To gauge whether these predictions are sensible, we investigate 75 predictions for 50 drugs lacking a binding affinity annotation in ChEMBL. The 535 million predictions for over 171 million compounds at 2629 targets are linked to purchasing information and evidence to support each prediction and are freely available via https://zinc15.docking.org and https://files.docking.org.
format Online
Article
Text
id pubmed-5780839
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-5780839 2018-01-25 J Chem Inf Model American Chemical Society 2017-12-01 2018-01-22 /pmc/articles/PMC5780839/ /pubmed/29193970 http://dx.doi.org/10.1021/acs.jcim.7b00316 Text en Copyright © 2017 American Chemical Society. This is an open access article published under an ACS AuthorChoice License (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html), which permits copying and redistribution of the article or any adaptations for non-commercial purposes.
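The abstract names two quantities that decide whether a prediction is flagged: a SEA significance score (pSEA ≥ 40) and the maximum Tanimoto similarity to the nearest known bioactive (ECFP4MaxTc ≥ 0.4). The following is a minimal sketch of that flagging logic only, not the authors' implementation. It assumes fingerprints are given as sets of on-bit indices and that a pSEA value has already been computed elsewhere; the function names and the toy fingerprints are hypothetical.

```python
from typing import Iterable, Set

PSEA_CUTOFF = 40.0   # "high significance" threshold quoted in the abstract
MAXTC_CUTOFF = 0.4   # "high similarity" threshold quoted in the abstract


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_tc(query: Set[int], bioactives: Iterable[Set[int]]) -> float:
    """Highest Tanimoto similarity between the query and any known bioactive."""
    return max((tanimoto(query, ref) for ref in bioactives), default=0.0)


def label_prediction(p_sea: float, max_tc_value: float) -> str:
    """Apply the two cutoffs; p_sea is assumed to be supplied by a SEA run."""
    significant = p_sea >= PSEA_CUTOFF
    similar = max_tc_value >= MAXTC_CUTOFF
    if significant and similar:
        return "both"
    if significant:
        return "high significance"
    if similar:
        return "high similarity"
    return "neither"


# Toy fingerprints (made-up bit indices, not real ECFP4 output).
query = {1, 2, 3, 4}
known_ligands = [{1, 2, 3, 9}, {7, 8}]
nearest = max_tc(query, known_ligands)
print(nearest, label_prediction(p_sea=12.0, max_tc_value=nearest))
```

In practice the fingerprints would come from a cheminformatics toolkit and the pSEA value from a SEA calculation against a target's ligand set; both are outside the scope of this sketch.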