CSAR Benchmark Exercise of 2010: Combined Evaluation Across All Submitted Scoring Functions

As part of the Community Structure-Activity Resource (CSAR) center, a set of 343 high-quality protein–ligand crystal structures was assembled with experimentally determined Kd or Ki information from the literature. We encouraged the community to score the crystallographic pos...

Bibliographic Details
Main Authors: Smith, Richard D., Dunbar, James B., Ung, Peter Man-Un, Esposito, Emilio X., Yang, Chao-Yie, Wang, Shaomeng, Carlson, Heather A.
Format: Online Article Text
Language: English
Published: American Chemical Society 2011
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3186041/
https://www.ncbi.nlm.nih.gov/pubmed/21809884
http://dx.doi.org/10.1021/ci200269q
author Smith, Richard D.
Dunbar, James B.
Ung, Peter Man-Un
Esposito, Emilio X.
Yang, Chao-Yie
Wang, Shaomeng
Carlson, Heather A.
collection PubMed
description As part of the Community Structure-Activity Resource (CSAR) center, a set of 343 high-quality protein–ligand crystal structures was assembled with experimentally determined Kd or Ki information from the literature. We encouraged the community to score the crystallographic poses of the complexes by any method of their choice. The goals of the exercise were to (1) evaluate the current ability of the field to predict activity from structure and (2) investigate the properties of the complexes and methods that appear to hinder scoring. A total of 19 different methods were submitted with numerous parameter variations, giving 64 sets of scores from 16 participating groups. Linear regression and nonparametric tests were used to correlate scores to the experimental values. Correlation to experiment for the various methods ranged over R² = 0.58–0.12, Spearman ρ = 0.74–0.37, Kendall τ = 0.55–0.25, and median unsigned error = 1.00–1.68 pKd units. All types of scoring functions (force field based, knowledge based, and empirical) had examples with high and low correlation, showing no bias or advantage for any particular approach. The data across all the participants were combined to identify 63 complexes that were poorly scored across the majority of the scoring methods and 123 complexes that were scored well across the majority. The two sets were compared using a Wilcoxon rank-sum test to assess any significant difference in the distributions of >400 physicochemical properties of the ligands and the proteins. Poorly scored complexes were found to have ligands that were the same size as those in well-scored complexes, but hydrogen bonding and torsional strain were significantly different. These comparisons point to a need for CSAR to develop data sets of congeneric series with a range of hydrogen-bonding and hydrophobic characteristics and a range of rotatable bonds.
format Online
Article
Text
id pubmed-3186041
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-3186041 2011-10-04 J Chem Inf Model American Chemical Society 2011-08-03 2011-09-26 /pmc/articles/PMC3186041/ /pubmed/21809884 http://dx.doi.org/10.1021/ci200269q Text en Copyright © 2011 American Chemical Society http://pubs.acs.org This is an open-access article distributed under the ACS AuthorChoice Terms & Conditions. Any use of this article must conform to the terms of that license, which are available at http://pubs.acs.org.
title CSAR Benchmark Exercise of 2010: Combined Evaluation Across All Submitted Scoring Functions
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3186041/
https://www.ncbi.nlm.nih.gov/pubmed/21809884
http://dx.doi.org/10.1021/ci200269q
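
The description in this record names four statistics used to compare scores with experimental affinities: R² from linear regression, Spearman ρ, Kendall τ, and median unsigned error. The following is a minimal sketch of how such statistics could be computed for one set of scores, using only the Python standard library. It is not the authors' code. All function names are hypothetical, Kendall τ is assumed to be the tie-corrected τ-b variant, and the median unsigned error is assumed to be taken over residuals of a least-squares line fitted from scores to experimental values.

```python
from statistics import median


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation; R^2 of a simple linear fit is pearson_r(x, y) ** 2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation of the (tie-averaged) ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall tau-b (assumed variant), counted over all pairs in O(n^2)."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes to no count
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / ((pairs + ties_x) * (pairs + ties_y)) ** 0.5


def median_unsigned_error(scores, experimental):
    """Median |residual| after fitting experimental = slope * score + intercept."""
    n = len(scores)
    mx, my = sum(scores) / n, sum(experimental) / n
    sxx = sum((s - mx) ** 2 for s in scores)
    slope = sum((s - mx) * (e - my) for s, e in zip(scores, experimental)) / sxx
    intercept = my - slope * mx
    return median(abs(e - (slope * s + intercept)) for s, e in zip(scores, experimental))
```

Calling `pearson_r(scores, pkd) ** 2`, `spearman_rho(scores, pkd)`, `kendall_tau_b(scores, pkd)`, and `median_unsigned_error(scores, pkd)` on two equal-length lists would give one row of the kind of comparison the description summarizes. The pairwise Kendall loop is quadratic, which is fine for a few hundred complexes.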