
Parameterization of disorder predictors for large-scale applications requiring high specificity by using an extended benchmark dataset


Bibliographic Details
Main Authors:	Sirota, Fernanda L; Ooi, Hong-Sain; Gattermayer, Tobias; Schneider, Georg; Eisenhaber, Frank; Maurer-Stroh, Sebastian
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2822529/ https://www.ncbi.nlm.nih.gov/pubmed/20158872 http://dx.doi.org/10.1186/1471-2164-11-S1-S15

author	Sirota, Fernanda L; Ooi, Hong-Sain; Gattermayer, Tobias; Schneider, Georg; Eisenhaber, Frank; Maurer-Stroh, Sebastian
collection	PubMed
description	BACKGROUND: Algorithms designed to predict protein disorder play an important role in structural and functional genomics, as disordered regions have been reported to participate in important cellular processes. Consequently, several methods with different underlying principles for disorder prediction have been independently developed by various groups. To assess their usability in automated workflows, we are interested in identifying parameter settings and threshold selections under which the performance of these predictors becomes directly comparable. RESULTS: First, we derived a new benchmark set that accounts for different flavours of disorder, complemented by a similar amount of order annotation derived for the same protein set. We show that, using the recommended default parameters, the programs tested produce a wide range of predictions at different levels of specificity and sensitivity. We identify settings under which the different predictors have the same false positive rate. We assess the conditions under which sets of predictors can be run together to derive consensus or complementary predictions. This is useful in the framework of proteome-wide applications where high specificity is required, such as in our in-house sequence analysis pipeline and the ANNIE webserver. CONCLUSIONS: This work identifies parameter settings and thresholds for a selection of disorder predictors to produce comparable results at a desired level of specificity over a newly derived benchmark dataset that accounts equally for ordered and disordered regions of different lengths.
format	Text
id	pubmed-2822529
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2822529 2010-02-17 BMC Genomics Research BioMed Central 2010-02-10 /pmc/articles/PMC2822529/ /pubmed/20158872 http://dx.doi.org/10.1186/1471-2164-11-S1-S15 Text en Copyright ©2010 Sirota et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Parameterization of disorder predictors for large-scale applications requiring high specificity by using an extended benchmark dataset
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2822529/ https://www.ncbi.nlm.nih.gov/pubmed/20158872 http://dx.doi.org/10.1186/1471-2164-11-S1-S15
