
Improving predicted protein loop structure ranking using a Pareto-optimality consensus method

BACKGROUND: Accurate protein loop structure models are important to understand functions of many proteins. Identifying the native or near-native models by distinguishing them from the misfolded ones is a critical step in protein loop structure prediction. RESULTS: We have developed a Pareto Optimal...


Bibliographic Details
Main Authors:	Li, Yaohang, Rata, Ionel, Chiu, See-wing, Jakobsson, Eric
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2914074/ https://www.ncbi.nlm.nih.gov/pubmed/20642859 http://dx.doi.org/10.1186/1472-6807-10-22

author	Li, Yaohang Rata, Ionel Chiu, See-wing Jakobsson, Eric
author_sort	Li, Yaohang
collection	PubMed
description	BACKGROUND: Accurate protein loop structure models are important for understanding the functions of many proteins. Identifying the native or near-native models by distinguishing them from the misfolded ones is a critical step in protein loop structure prediction. RESULTS: We have developed a Pareto Optimal Consensus (POC) method, a consensus model ranking approach that integrates multiple knowledge- or physics-based scoring functions. The procedure for identifying the best-quality models in a model set includes: 1) identifying the models at the Pareto optimal front with respect to a set of scoring functions, and 2) ranking them based on their fuzzy dominance relationship to the rest of the models. We apply the POC method to a large number of decoy sets for loops of 4 to 12 residues in length, using a functional space composed of several carefully selected scoring functions: Rosetta, DOPE, DDFIRE, OPLS-AA, and a triplet backbone dihedral potential developed in our lab. Our computational results show that the sets of Pareto-optimal decoys, which typically make up ~20% or less of the decoys in a set, cover the best or near-best decoys in more than 99% of the loop targets. Compared to the individual scoring function yielding the best selection accuracy in the decoy sets, the POC method yields 23%, 37%, and 64% fewer false positives in distinguishing the native conformation, identifying a near-native model (RMSD < 0.5 Å from the native) as top-ranked, and selecting at least one near-native model among the top 5 ranked models, respectively. Similar effectiveness of the POC method is also found in the decoy sets from membrane protein loops. Furthermore, the POC method outperforms other commonly used consensus strategies in model ranking, such as rank-by-number, rank-by-rank, rank-by-vote, and regression-based methods.
CONCLUSIONS: By integrating multiple knowledge- and physics-based scoring functions based on Pareto optimality and fuzzy dominance, the POC method is effective in distinguishing the best loop models from the other ones within a loop model set.
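
The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each decoy is a tuple of scores in which lower is better. Because the abstract does not give the fuzzy dominance formula, step 2 uses a plain dominance count as a stand-in. The names `dominates`, `pareto_front`, and `rank_front`, and the toy score values, are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every score and strictly better
    on at least one (assumption: lower score = better model)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Step 1: indices of decoys that no other decoy dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def rank_front(scores: List[Sequence[float]]) -> List[int]:
    """Step 2 (stand-in): order front members by how many decoys they
    dominate. The paper uses a fuzzy dominance relationship instead;
    this crisp count only illustrates the idea of ranking the front
    against the rest of the set."""
    front = pareto_front(scores)
    counts = {i: sum(dominates(scores[i], t) for t in scores) for i in front}
    return sorted(front, key=lambda i: (-counts[i], i))


# Toy decoy set: two hypothetical scoring functions per decoy.
decoys = [
    (1.0, 5.0),  # 0: best on the first score
    (2.0, 2.0),  # 1: balanced
    (3.0, 3.0),  # 2: dominated by decoy 1
    (5.0, 1.0),  # 3: best on the second score
    (4.0, 6.0),  # 4: dominated by decoys 0, 1, and 2
]

front = pareto_front(decoys)
ranking = rank_front(decoys)
print(front, ranking)
```

In this toy set the front keeps the decoys that are best under at least one trade-off between the two scores, and the ranking then prefers the front member that beats the most other decoys outright.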
format	Text
id	pubmed-2914074
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2914074 2010-08-03 Improving predicted protein loop structure ranking using a Pareto-optimality consensus method Li, Yaohang Rata, Ionel Chiu, See-wing Jakobsson, Eric BMC Struct Biol Methodology Article BioMed Central 2010-07-20 /pmc/articles/PMC2914074/ /pubmed/20642859 http://dx.doi.org/10.1186/1472-6807-10-22 Text en Copyright ©2010 Li et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Methodology Article Li, Yaohang Rata, Ionel Chiu, See-wing Jakobsson, Eric Improving predicted protein loop structure ranking using a Pareto-optimality consensus method
title	Improving predicted protein loop structure ranking using a Pareto-optimality consensus method
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2914074/ https://www.ncbi.nlm.nih.gov/pubmed/20642859 http://dx.doi.org/10.1186/1472-6807-10-22
