A generic method for assignment of reliability scores applied to solvent accessibility predictions

Bibliographic Details
Main authors: Petersen, Bent, Petersen, Thomas Nordahl, Andersen, Pernille, Nielsen, Morten, Lundegaard, Claus
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2725087/
https://www.ncbi.nlm.nih.gov/pubmed/19646261
http://dx.doi.org/10.1186/1472-6807-9-51
author Petersen, Bent
Petersen, Thomas Nordahl
Andersen, Pernille
Nielsen, Morten
Lundegaard, Claus
collection PubMed
description BACKGROUND: Estimating the reliability of specific real-value predictions is nontrivial, and the efficacy of such estimates is often questionable. It is important to know whether a given prediction can be trusted, and therefore the best methods associate each prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between the output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability of each prediction, in the form of a Z-score.
RESULTS: An ensemble of artificial neural networks has been trained on a set of experimentally solved protein structures to predict the relative exposure of the amino acids. The method assigns a reliability score to each surface accessibility prediction as an inherent part of the training process. This is in contrast to the most commonly used procedures, where reliabilities are obtained by post-processing the output.
CONCLUSION: The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best publicly available method, Real-SPINE. Both methods associate a reliability score with the individual predictions. However, our implementation of reliability scores in the form of a Z-score is shown to be the more informative measure for discriminating good predictions from bad ones over the entire range from completely buried to fully exposed amino acids. This is evident when comparing the Pearson's correlation coefficient for the upper 20% of predictions sorted according to reliability. For this subset, values of 0.79 and 0.74 are obtained using our method and the compared method, respectively. This tendency holds for any selected subset.
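The evaluation described in the abstract (Pearson's correlation coefficient for the upper 20% of predictions sorted by reliability) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `pearson`, `top_fraction_pearson`, and `synthetic_demo` are hypothetical, the data are randomly generated rather than taken from CB513, and the reliability values are simply any score for which higher means more trustworthy. The abstract does not say how the Z-score itself is computed, so that step is not reproduced here.

```python
import math
import random


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def top_fraction_pearson(predicted, observed, reliability, fraction=0.2):
    """Correlation restricted to the most reliable `fraction` of predictions.

    Assumes a higher reliability value means a more trustworthy prediction.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not (len(predicted) == len(observed) == len(reliability)):
        raise ValueError("all inputs must have the same length")
    # Indices sorted from most to least reliable.
    order = sorted(range(len(reliability)),
                   key=lambda i: reliability[i], reverse=True)
    # Keep at least two points so the correlation is defined.
    k = max(2, math.ceil(fraction * len(order)))
    keep = order[:k]
    return pearson([predicted[i] for i in keep],
                   [observed[i] for i in keep])


def synthetic_demo(n=2000, seed=0):
    """Made-up data: each prediction has its own noise level, and the
    reliability score is the negated noise level (an idealised stand-in,
    not the published Z-score)."""
    rng = random.Random(seed)
    observed = [rng.random() for _ in range(n)]
    noise_sd = [rng.uniform(0.02, 0.4) for _ in range(n)]
    predicted = [o + rng.gauss(0.0, s) for o, s in zip(observed, noise_sd)]
    reliability = [-s for s in noise_sd]
    overall = pearson(predicted, observed)
    top = top_fraction_pearson(predicted, observed, reliability, 0.2)
    return overall, top
```

Calling `synthetic_demo()` returns the overall coefficient and the coefficient for the most reliable 20% of the made-up predictions. An informative reliability score should give a higher value for the subset than for the full set, which is the pattern the abstract reports (0.79 against an overall 0.72). The numbers produced by the sketch say nothing about the published method itself.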
format Text
id pubmed-2725087
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2725087 2009-08-12 BMC Struct Biol Methodology Article BioMed Central 2009-07-31 /pmc/articles/PMC2725087/ /pubmed/19646261 http://dx.doi.org/10.1186/1472-6807-9-51 Text en Copyright © 2009 Petersen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A generic method for assignment of reliability scores applied to solvent accessibility predictions
topic Methodology Article