How accurate and statistically robust are catalytic site predictions based on closeness centrality?
BACKGROUND: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise...
Main authors: | Chea, Eric, Livesay, Dennis R |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1876251/ https://www.ncbi.nlm.nih.gov/pubmed/17498304 http://dx.doi.org/10.1186/1471-2105-8-153 |
_version_ | 1782133521051025408 |
---|---|
author | Chea, Eric Livesay, Dennis R |
author_facet | Chea, Eric Livesay, Dennis R |
author_sort | Chea, Eric |
collection | PubMed |
description | BACKGROUND: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality, a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices. RESULTS: We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are in line with previous investigations of smaller datasets, indicate closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate from consideration residues with physicochemical properties unlikely to be compatible with catalytic requirements. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters has little effect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined. CONCLUSION: Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predictive power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation. |
format | Text |
id | pubmed-1876251 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1876251 2007-05-22 How accurate and statistically robust are catalytic site predictions based on closeness centrality? Chea, Eric Livesay, Dennis R BMC Bioinformatics Research Article BACKGROUND: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality, a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices. RESULTS: We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are in line with previous investigations of smaller datasets, indicate closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate from consideration residues with physicochemical properties unlikely to be compatible with catalytic requirements. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters has little effect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined. CONCLUSION: Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predictive power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation. BioMed Central 2007-05-11 /pmc/articles/PMC1876251/ /pubmed/17498304 http://dx.doi.org/10.1186/1471-2105-8-153 Text en Copyright © 2007 Chea and Livesay; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Chea, Eric Livesay, Dennis R How accurate and statistically robust are catalytic site predictions based on closeness centrality? |
title | How accurate and statistically robust are catalytic site predictions based on closeness centrality? |
title_sort | how accurate and statistically robust are catalytic site predictions based on closeness centrality? |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1876251/ https://www.ncbi.nlm.nih.gov/pubmed/17498304 http://dx.doi.org/10.1186/1471-2105-8-153 |
work_keys_str_mv | AT cheaeric howaccurateandstatisticallyrobustarecatalyticsitepredictionsbasedonclosenesscentrality AT livesaydennisr howaccurateandstatisticallyrobustarecatalyticsitepredictionsbasedonclosenesscentrality |
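
The description above defines closeness centrality as the reciprocal of the average distance between vertex i and all other vertices, on a graph whose vertices are α-carbons and whose edges join structurally proximal residues. The following is a minimal sketch of that idea, not the authors' implementation. The 8.5 Å cutoff, the use of unweighted hop counts as the distance, the function names, and the exact definition of the true to false positive rate ratio are all assumptions made for illustration; the record does not specify them.

```python
from collections import deque
from math import dist


def build_contact_graph(coords, cutoff=8.5):
    """Link residues whose alpha-carbons lie within `cutoff` angstroms.

    The cutoff value is an assumption; the record only says that edges
    connect vertices that are proximal in structure.
    """
    n = len(coords)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def closeness_centrality(adj):
    """Reciprocal of the average shortest-path distance to all other vertices.

    Distances are hop counts found by breadth-first search (an assumption).
    A vertex that cannot reach every other vertex is scored 0.0.
    """
    n = len(adj)
    scores = {}
    for src in adj:
        hops = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        total = sum(hops.values())
        reachable_all = len(hops) == n and total > 0
        scores[src] = (n - 1) / total if reachable_all else 0.0
    return scores


def top_k(scores, k=5):
    """Indices of the k best-scoring residues (ties broken by index)."""
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


def rate_ratio(predicted, catalytic, n_residues):
    """True positive rate divided by false positive rate.

    Assumed definitions: TPR = TP / (catalytic residues),
    FPR = FP / (non-catalytic residues).
    """
    predicted, catalytic = set(predicted), set(catalytic)
    n_negative = n_residues - len(catalytic)
    if not catalytic or n_negative <= 0:
        raise ValueError("need at least one catalytic and one other residue")
    tpr = len(predicted & catalytic) / len(catalytic)
    fpr = len(predicted - catalytic) / n_negative
    return tpr / fpr if fpr else float("inf")


if __name__ == "__main__":
    # Toy chain of five pseudo-residues spaced 1 angstrom apart.
    coords = [(float(i), 0.0, 0.0) for i in range(5)]
    scores = closeness_centrality(build_contact_graph(coords, cutoff=1.5))
    print(top_k(scores, k=2))
```

On a real structure the coordinates would come from the α-carbon records of a PDB file, and the solvent accessibility and residue identity filters mentioned in the description would be applied to the ranked list before scoring.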