Cargando…

Structure-Based Predictive Models for Allosteric Hot Spots

In allostery, a binding event at one site in a protein modulates the behavior of a distant site. Identifying residues that relay the signal between sites remains a challenge. We have developed predictive models using support-vector machines, a widely used machine-learning method. The training data s...

Descripción completa

Detalles Bibliográficos
Autores principales: Demerdash, Omar N. A., Daily, Michael D., Mitchell, Julie C.
Formato: Texto
Lenguaje:English
Publicado: Public Library of Science 2009
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2748687/
https://www.ncbi.nlm.nih.gov/pubmed/19816556
http://dx.doi.org/10.1371/journal.pcbi.1000531
Descripción
Sumario:In allostery, a binding event at one site in a protein modulates the behavior of a distant site. Identifying residues that relay the signal between sites remains a challenge. We have developed predictive models using support-vector machines, a widely used machine-learning method. The training data set consisted of residues classified as either hotspots or non-hotspots based on experimental characterization of point mutations from a diverse set of allosteric proteins. Each residue had an associated set of calculated features. Two sets of features were used, one consisting of dynamical, structural, network, and informatic measures, and another of structural measures defined by Daily and Gray [1]. The resulting models performed well on an independent data set consisting of hotspots and non-hotspots from five allosteric proteins. For the independent data set, our top 10 models using Feature Set 1 recalled 68–81% of known hotspots, and among total hotspot predictions, 58–67% were actual hotspots. Hence, these models have precision P = 58–67% and recall R = 68–81%. The corresponding models for Feature Set 2 had P = 55–59% and R = 81–92%. We combined the features from each set that produced models with optimal predictive performance. The top 10 models using this hybrid feature set had R = 73–81% and P = 64–71%, the best overall performance of any of the sets of models. Our methods identified hotspots in structural regions of known allosteric significance. Moreover, our predicted hotspots form a network of contiguous residues in the interior of the structures, in agreement with previous work. In conclusion, we have developed models that discriminate between known allosteric hotspots and non-hotspots with high accuracy and sensitivity. Moreover, the pattern of predicted hotspots corresponds to known functional motifs implicated in allostery, and is consistent with previous work describing sparse networks of allosterically important residues.