Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets

BACKGROUND: Natively unfolded proteins lack a well defined three dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. To assess that a given protein is natively unfolded requires laborious experimental investigations, then reli...

Bibliographic Details
Main authors: Deiana, Antonio; Giansanti, Andrea
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877690/
https://www.ncbi.nlm.nih.gov/pubmed/20409339
http://dx.doi.org/10.1186/1471-2105-11-198
_version_ 1782181801820684288
author Deiana, Antonio
Giansanti, Andrea
author_facet Deiana, Antonio
Giansanti, Andrea
author_sort Deiana, Antonio
collection PubMed
description BACKGROUND: Natively unfolded proteins lack a well-defined three-dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. Establishing that a given protein is natively unfolded requires laborious experimental investigation, so reliable sequence-only methods for predicting whether a sequence corresponds to a folded or an unfolded protein are of interest in both fundamental and applied studies. Many proteins have amino acid compositions compatible with both the folded and the unfolded state, and belong to a twilight zone between order and disorder. This makes a dichotomic classification of protein sequences into folded and natively unfolded ones difficult. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining well-performing single predictors of folding into a consensus score. RESULTS: In this methodological paper the following dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, here called gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed of 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy show good performance and stability on all the datasets considered and are combined into a strictly unanimous combination score S(SU), which leaves a protein unclassified when the combined indexes do not reach consensus. The unclassified proteins: i) belong to an overlap region in the vector space of amino acid compositions occupied by both folded and unfolded proteins; ii) are composed of approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins.
CONCLUSIONS: Our results show that proteins unclassified by S(SU) belong to a twilight zone. Proteins left unclassified by the consensus score S(SU) have physical properties intermediate between those of folded and those of natively unfolded proteins, and their structural properties and evolutionary history are worth investigating.
format Text
id pubmed-2877690
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-28776902010-05-27 Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets Deiana, Antonio Giansanti, Andrea BMC Bioinformatics Methodology article BACKGROUND: Natively unfolded proteins lack a well defined three dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. To assess that a given protein is natively unfolded requires laborious experimental investigations, then reliable sequence-only methods for predicting whether a sequence corresponds to a folded or to an unfolded protein are of interest in fundamental and applicative studies. Many proteins have amino acidic compositions compatible both with the folded and unfolded status, and belong to a twilight zone between order and disorder. This makes difficult a dichotomic classification of protein sequences into folded and natively unfolded ones. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining into a consensus score good performing single predictors of folding. RESULTS: In this methodological paper dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, that is called here gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed by 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy have good performance and stability in all the datasets considered and are combined into a strictly unanimous combination score S(SU), that leaves proteins unclassified when the consensus of all combined indexes is not reached. 
The unclassified proteins: i) belong to an overlap region in the vector space of amino acidic compositions occupied by both folded and unfolded proteins; ii) are composed by approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins. CONCLUSIONS: Our results show that proteins unclassified by S(SU) belong to a twilight zone. Proteins left unclassified by the consensus score S(SU) have physical properties intermediate between those of folded and those of natively unfolded proteins and their structural properties and evolutionary history are worth to be investigated. BioMed Central 2010-04-21 /pmc/articles/PMC2877690/ /pubmed/20409339 http://dx.doi.org/10.1186/1471-2105-11-198 Text en Copyright ©2010 Deiana and Giansanti; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology article
Deiana, Antonio
Giansanti, Andrea
Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets
title Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets
title_full Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets
title_fullStr Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets
title_full_unstemmed Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets
title_short Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets
title_sort predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets
topic Methodology article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877690/
https://www.ncbi.nlm.nih.gov/pubmed/20409339
http://dx.doi.org/10.1186/1471-2105-11-198
work_keys_str_mv AT deianaantonio predictorsofnativelyunfoldedproteinsunanimousconsensusscoretodetectatwilightzonebetweenorderanddisorderingenericdatasets
AT giansantiandrea predictorsofnativelyunfoldedproteinsunanimousconsensusscoretodetectatwilightzonebetweenorderanddisorderingenericdatasets
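
Illustrative note: the abstract describes S(SU) as a strictly unanimous combination of three dichotomic indexes (Poodle-W, gVSL2 and mean pairwise energy) that leaves a protein unclassified when the indexes disagree. The following is a minimal sketch of that voting rule only, not the authors' implementation. The function name, the boolean encoding of each index's call (True for folded) and the example votes are assumptions made for illustration; the record does not say how each index is computed or thresholded.

```python
from typing import Sequence


def strictly_unanimous_score(votes: Sequence[bool]) -> str:
    """Combine dichotomic calls; votes[i] is True when index i says 'folded'.

    Hypothetical helper: returns a class only when every index agrees,
    otherwise reports the protein as unclassified (twilight zone).
    """
    if not votes:
        raise ValueError("at least one predictor vote is required")
    if all(votes):
        return "folded"
    if not any(votes):
        return "unfolded"
    return "unclassified"


# Made-up votes in the assumed order: Poodle-W, gVSL2, mean pairwise energy.
example_votes = {
    "protein_A": (True, True, True),
    "protein_B": (False, False, False),
    "protein_C": (True, False, True),
}

for name, votes in example_votes.items():
    print(name, strictly_unanimous_score(votes))
```

Under this rule a single dissenting index is enough to leave a protein unclassified, which is what separates a strictly unanimous score from a majority vote.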