How accurate and statistically robust are catalytic site predictions based on closeness centrality?

BACKGROUND: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise...

Bibliographic Details
Main authors: Chea, Eric, Livesay, Dennis R
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1876251/
https://www.ncbi.nlm.nih.gov/pubmed/17498304
http://dx.doi.org/10.1186/1471-2105-8-153
_version_ 1782133521051025408
author Chea, Eric
Livesay, Dennis R
author_facet Chea, Eric
Livesay, Dennis R
author_sort Chea, Eric
collection PubMed
description BACKGROUND: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality, a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices. RESULTS: We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are in line with previous investigations of smaller datasets, indicate that closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate from consideration residues with physicochemical properties unlikely to be compatible with catalytic requirements. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters has little effect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined. CONCLUSION: Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predictive power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation.
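The abstract defines closeness centrality as the reciprocal of the average distance between vertex i and all other vertices, on a graph whose vertices are α-carbons and whose edges join structurally proximal residues. The sketch below illustrates that definition only; it is not the authors' code. The function names, the 8.5 Å contact cutoff, the use of unweighted hop counts as the distance, and the choice to score vertices in a disconnected graph as 0.0 are all assumptions made for illustration and are not stated in the record.

```python
from collections import deque
from math import dist


def contact_graph(coords, cutoff=8.5):
    """Build adjacency lists from C-alpha coordinates.

    Residues i and j are linked when their C-alpha atoms lie within
    `cutoff` angstroms. The default cutoff is an assumed value.
    """
    n = len(coords)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def closeness_centrality(adj):
    """Return (n - 1) / sum of shortest-path hop counts for each vertex."""
    n = len(adj)
    scores = []
    for source in range(n):
        # Breadth-first search gives shortest hop counts in an unweighted graph.
        hops = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        total = sum(hops.values())
        if len(hops) < n or total == 0:
            # Assumption: unreachable vertices (or a single-vertex graph) score 0.0.
            scores.append(0.0)
        else:
            scores.append((n - 1) / total)
    return scores


def top_ranked(scores, k=5):
    """Indices of the k highest-scoring residues, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

Calling `top_ranked(closeness_centrality(contact_graph(coords)), k=5)` on a list of (x, y, z) tuples would give the five most central positions, mirroring the "top five scores" the abstract evaluates; the accessibility and residue identity filters it describes are not modelled here.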
format Text
id pubmed-1876251
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1876251 2007-05-22 How accurate and statistically robust are catalytic site predictions based on closeness centrality? Chea, Eric Livesay, Dennis R BMC Bioinformatics Research Article BioMed Central 2007-05-11 /pmc/articles/PMC1876251/ /pubmed/17498304 http://dx.doi.org/10.1186/1471-2105-8-153 Text en Copyright © 2007 Chea and Livesay; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title How accurate and statistically robust are catalytic site predictions based on closeness centrality?
topic Research Article