How accurate and statistically robust are catalytic site predictions based on closeness centrality?

BACKGROUND: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise...

Bibliographic Details
Main Authors:	Chea, Eric; Livesay, Dennis R
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1876251/ https://www.ncbi.nlm.nih.gov/pubmed/17498304 http://dx.doi.org/10.1186/1471-2105-8-153

_version_	1782133521051025408
author	Chea, Eric Livesay, Dennis R
author_facet	Chea, Eric Livesay, Dennis R
author_sort	Chea, Eric
collection	PubMed
description	BACKGROUND: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality, a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices. RESULTS: We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are in line with previous investigations of smaller datasets, indicate closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate from consideration residues with physicochemical properties unlikely to be compatible with catalytic requirements. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters together has little effect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined. CONCLUSION: Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predictive power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation.
format	Text
id	pubmed-1876251
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1876251 2007-05-22 How accurate and statistically robust are catalytic site predictions based on closeness centrality? Chea, Eric Livesay, Dennis R BMC Bioinformatics Research Article BACKGROUND: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality, a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices. RESULTS: We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are in line with previous investigations of smaller datasets, indicate closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate from consideration residues with physicochemical properties unlikely to be compatible with catalytic requirements. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters together has little effect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined. CONCLUSION: Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predictive power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation. BioMed Central 2007-05-11 /pmc/articles/PMC1876251/ /pubmed/17498304 http://dx.doi.org/10.1186/1471-2105-8-153 Text en Copyright © 2007 Chea and Livesay; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Chea, Eric Livesay, Dennis R How accurate and statistically robust are catalytic site predictions based on closeness centrality?
title	How accurate and statistically robust are catalytic site predictions based on closeness centrality?
title_full	How accurate and statistically robust are catalytic site predictions based on closeness centrality?
title_fullStr	How accurate and statistically robust are catalytic site predictions based on closeness centrality?
title_full_unstemmed	How accurate and statistically robust are catalytic site predictions based on closeness centrality?
title_short	How accurate and statistically robust are catalytic site predictions based on closeness centrality?
title_sort	how accurate and statistically robust are catalytic site predictions based on closeness centrality?
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1876251/ https://www.ncbi.nlm.nih.gov/pubmed/17498304 http://dx.doi.org/10.1186/1471-2105-8-153
work_keys_str_mv	AT cheaeric howaccurateandstatisticallyrobustarecatalyticsitepredictionsbasedonclosenesscentrality AT livesaydennisr howaccurateandstatisticallyrobustarecatalyticsitepredictionsbasedonclosenesscentrality
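
The network model summarized in the abstract (α-carbons as vertices, edges between structurally proximal residues, closeness centrality as the reciprocal of the average distance from vertex i to all other vertices) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the distance `cutoff` parameter, the use of unweighted shortest-path lengths, and the choice to score vertices of a disconnected graph as 0.0 are all assumptions made for illustration, since the record does not state them.

```python
from collections import deque
import math


def build_contact_graph(coords, cutoff):
    """Link residues i and j when their alpha-carbons lie within `cutoff`.

    `coords` is a list of (x, y, z) tuples; `cutoff` is a hypothetical
    distance threshold (the record does not give the value used).
    """
    n = len(coords)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def closeness_centrality(adj):
    """Reciprocal of the average shortest-path length from each vertex
    to all other vertices, using breadth-first search on an unweighted graph."""
    n = len(adj)
    scores = []
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        if len(dist) < n or total == 0:
            # Assumption: unreachable vertices (or a single-vertex graph)
            # make the average distance undefined, so score 0.0.
            scores.append(0.0)
        else:
            scores.append((n - 1) / total)
    return scores


def top_k(scores, k=5):
    """Indices of the k highest-scoring residues (the abstract focuses on the top five)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

For example, five points spaced one unit apart on a line with `cutoff=1.5` form a path graph, and the middle residue receives the highest score because its average distance to the others is smallest. The accessibility and residue-identity filters described in the abstract would be applied to the ranked list afterwards and are not modeled here.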
