Predicted Biological Activity of Purchasable Chemical Space
Whereas 400 million distinct compounds are now purchasable within the span of a few weeks, the biological activities of most are unknown. To facilitate access to new chemistry for biology, we have combined the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity...
Main Authors: | Irwin, John J., Gaskins, Garrett, Sterling, Teague, Mysinger, Michael M., Keiser, Michael J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2017 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5780839/ https://www.ncbi.nlm.nih.gov/pubmed/29193970 http://dx.doi.org/10.1021/acs.jcim.7b00316 |
_version_ | 1783294819524673536 |
---|---|
author | Irwin, John J. Gaskins, Garrett Sterling, Teague Mysinger, Michael M. Keiser, Michael J. |
author_facet | Irwin, John J. Gaskins, Garrett Sterling, Teague Mysinger, Michael M. Keiser, Michael J. |
author_sort | Irwin, John J. |
collection | PubMed |
description | Whereas 400 million distinct compounds are now purchasable within the span of a few weeks, the biological activities of most are unknown. To facilitate access to new chemistry for biology, we have combined the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity to the nearest bioactive to predict activity for every commercially available molecule in ZINC. This method, which we label SEA+TC, outperforms both SEA and a naïve-Bayesian classifier via predictive performance on a 5-fold cross-validation of ChEMBL’s bioactivity data set (version 21). Using this method, predictions for over 40% of compounds (>160 million) have either high significance (pSEA ≥ 40), high similarity (ECFP4MaxTc ≥ 0.4), or both, for one or more of 1382 targets well described by ligands in the literature. Using a further 1347 less-well-described targets, we predict activities for an additional 11 million compounds. To gauge whether these predictions are sensible, we investigate 75 predictions for 50 drugs lacking a binding affinity annotation in ChEMBL. The 535 million predictions for over 171 million compounds at 2629 targets are linked to purchasing information and evidence to support each prediction and are freely available via https://zinc15.docking.org and https://files.docking.org. |
format | Online Article Text |
id | pubmed-5780839 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-5780839 2018-01-25 Predicted Biological Activity of Purchasable Chemical Space Irwin, John J. Gaskins, Garrett Sterling, Teague Mysinger, Michael M. Keiser, Michael J. J Chem Inf Model Whereas 400 million distinct compounds are now purchasable within the span of a few weeks, the biological activities of most are unknown. To facilitate access to new chemistry for biology, we have combined the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity to the nearest bioactive to predict activity for every commercially available molecule in ZINC. This method, which we label SEA+TC, outperforms both SEA and a naïve-Bayesian classifier via predictive performance on a 5-fold cross-validation of ChEMBL’s bioactivity data set (version 21). Using this method, predictions for over 40% of compounds (>160 million) have either high significance (pSEA ≥ 40), high similarity (ECFP4MaxTc ≥ 0.4), or both, for one or more of 1382 targets well described by ligands in the literature. Using a further 1347 less-well-described targets, we predict activities for an additional 11 million compounds. To gauge whether these predictions are sensible, we investigate 75 predictions for 50 drugs lacking a binding affinity annotation in ChEMBL. The 535 million predictions for over 171 million compounds at 2629 targets are linked to purchasing information and evidence to support each prediction and are freely available via https://zinc15.docking.org and https://files.docking.org. American Chemical Society 2017-12-01 2018-01-22 /pmc/articles/PMC5780839/ /pubmed/29193970 http://dx.doi.org/10.1021/acs.jcim.7b00316 Text en Copyright © 2017 American Chemical Society This is an open access article published under an ACS AuthorChoice License (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html), which permits copying and redistribution of the article or any adaptations for non-commercial purposes. |
spellingShingle | Irwin, John J. Gaskins, Garrett Sterling, Teague Mysinger, Michael M. Keiser, Michael J. Predicted Biological Activity of Purchasable Chemical Space |
title | Predicted Biological Activity of Purchasable Chemical Space |
title_full | Predicted Biological Activity of Purchasable Chemical Space |
title_fullStr | Predicted Biological Activity of Purchasable Chemical Space |
title_full_unstemmed | Predicted Biological Activity of Purchasable Chemical Space |
title_short | Predicted Biological Activity of Purchasable Chemical Space |
title_sort | predicted biological activity of purchasable chemical space |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5780839/ https://www.ncbi.nlm.nih.gov/pubmed/29193970 http://dx.doi.org/10.1021/acs.jcim.7b00316 |
work_keys_str_mv | AT irwinjohnj predictedbiologicalactivityofpurchasablechemicalspace AT gaskinsgarrett predictedbiologicalactivityofpurchasablechemicalspace AT sterlingteague predictedbiologicalactivityofpurchasablechemicalspace AT mysingermichaelm predictedbiologicalactivityofpurchasablechemicalspace AT keisermichaelj predictedbiologicalactivityofpurchasablechemicalspace |
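
The abstract in this record describes flagging a prediction when it has high significance (pSEA ≥ 40), high similarity to the nearest known bioactive (ECFP4 MaxTc ≥ 0.4), or both. The sketch below is only an illustration of that selection rule, not the authors' implementation. It assumes fingerprints are already available as sets of on-bit indices (the toy sets here are not real ECFP4 fingerprints), and it takes the pSEA value as a given input, since computing it is outside the scope of this example. The function names are hypothetical.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen for this sketch: two empty fingerprints score 0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_tc(query: Set[int], actives: Iterable[Set[int]]) -> float:
    """Highest Tanimoto similarity between the query and any reference active."""
    return max((tanimoto(query, ref) for ref in actives), default=0.0)


def passes_thresholds(p_sea: float, max_tc_value: float,
                      p_sea_cutoff: float = 40.0, tc_cutoff: float = 0.4) -> bool:
    """True if either cutoff quoted in the abstract is met (or both)."""
    return p_sea >= p_sea_cutoff or max_tc_value >= tc_cutoff


# Toy data (assumed for illustration only).
query_fp = {1, 2, 3}
known_actives = [{2, 3, 4}, {7, 8}]

nearest = max_tc(query_fp, known_actives)
keep = passes_thresholds(p_sea=10.0, max_tc_value=nearest)
```

With these toy sets the nearest active shares two of four distinct bits with the query, so the similarity cutoff alone is enough to keep the prediction even though the assumed pSEA value is low.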