
Length-dependent prediction of protein intrinsic disorder

BACKGROUND: Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure...


Bibliographic Details
Main authors:	Peng, Kang; Radivojac, Predrag; Vucetic, Slobodan; Dunker, A Keith; Obradovic, Zoran
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Software
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1479845/ https://www.ncbi.nlm.nih.gov/pubmed/16618368 http://dx.doi.org/10.1186/1471-2105-7-208

author	Peng, Kang; Radivojac, Predrag; Vucetic, Slobodan; Dunker, A Keith; Obradovic, Zoran
collection	PubMed
description	BACKGROUND: Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research, as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure Prediction (CASP6). Since the initial work by Romero et al. (Identifying disordered regions in proteins from amino acid sequences, IEEE Int. Conf. Neural Netw., 1997), our group has developed several predictors optimized for long disordered regions (>30 residues) with prediction accuracy exceeding 85%. However, these predictors are less successful on short disordered regions (≤30 residues). A probable cause is the length dependence of the amino acid compositions and sequence properties of disordered regions. RESULTS: We proposed two new predictor models, VSL2-M1 and VSL2-M2, to address this length-dependency problem in prediction of intrinsic protein disorder. These two predictors are similar to the original VSL1 predictor used in the CASP6 experiment. In both models, two specialized predictors were first built and optimized for short (≤30 residues) and long (>30 residues) disordered regions, respectively. A meta predictor was then trained to integrate the specialized predictors into the final predictor model. As the 10-fold cross-validation results showed, the VSL2 predictors achieved well-balanced prediction accuracies of 81% on both short and long disordered regions. Comparisons over the VSL2 training dataset via 10-fold cross-validation and over a blind-test set of unrelated recent PDB chains indicated that the VSL2 predictors were significantly more accurate than several existing predictors of intrinsic protein disorder. CONCLUSION: The VSL2 predictors are applicable to disordered regions of any length and can accurately identify the short disordered regions that are often misclassified by our previous disorder predictors. The success of the VSL2 predictors further confirmed the previously observed differences in amino acid compositions and sequence properties between short and long disordered regions, and justified our approach of modelling short and long disordered regions separately. The VSL2 predictors are freely accessible for non-commercial use at
format	Text
id	pubmed-1479845
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1479845 2006-06-17 Length-dependent prediction of protein intrinsic disorder Peng, Kang; Radivojac, Predrag; Vucetic, Slobodan; Dunker, A Keith; Obradovic, Zoran BMC Bioinformatics Software BioMed Central 2006-04-17 /pmc/articles/PMC1479845/ /pubmed/16618368 http://dx.doi.org/10.1186/1471-2105-7-208 Text en Copyright © 2006 Peng et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Length-dependent prediction of protein intrinsic disorder
topic	Software
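
The abstract in the description field outlines a two-stage design: two specialized predictors (one for short, one for long disordered regions) and a meta predictor that integrates them. The sketch below illustrates one way such an integration could work. The combination rule (a per-residue weighted average, with the meta predictor's output used as the weight on the long-region predictor), the function names, and the 0.5 threshold are all assumptions made for illustration; the record does not state how VSL2 actually combines its components.

```python
from typing import List, Sequence


def combine_predictions(
    short_scores: Sequence[float],
    long_scores: Sequence[float],
    meta_weights: Sequence[float],
) -> List[float]:
    """Blend per-residue scores from two specialized predictors.

    Assumed rule (not taken from the record): for each residue,
    final = w * long_score + (1 - w) * short_score,
    where w in [0, 1] is a hypothetical meta-predictor output.
    """
    if not (len(short_scores) == len(long_scores) == len(meta_weights)):
        raise ValueError("all inputs must have one value per residue")

    combined = []
    for s, l, w in zip(short_scores, long_scores, meta_weights):
        if not 0.0 <= w <= 1.0:
            raise ValueError("meta weights must lie in [0, 1]")
        combined.append(w * l + (1.0 - w) * s)
    return combined


def disordered_mask(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Mark residues whose combined score reaches the (assumed) threshold."""
    return [score >= threshold for score in scores]
```

With a weight of 1.0 the result equals the long-region score, and with 0.0 it equals the short-region score, so the hypothetical meta predictor effectively decides per residue which specialist to trust.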

