
Improving consensus contact prediction via server correlation reduction


Bibliographic Details
Main authors:	Gao, Xin; Bu, Dongbo; Xu, Jinbo; Li, Ming
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2689239/ https://www.ncbi.nlm.nih.gov/pubmed/19419562 http://dx.doi.org/10.1186/1472-6807-9-28

author	Gao, Xin; Bu, Dongbo; Xu, Jinbo; Li, Ming
collection	PubMed
description	BACKGROUND: Protein inter-residue contacts play a crucial role in the determination and prediction of protein structures. Previous studies on contact prediction indicate that although template-based consensus methods outperform sequence-based methods on targets with typical templates, such consensus methods perform poorly on new fold targets. However, we find that even for new fold targets, the models generated by threading programs can contain many true contacts. The challenge is how to identify them. RESULTS: In this paper, we develop an integer linear programming model for consensus contact prediction. In contrast to the simple majority voting method, which assumes that all the individual servers are equally important and independent, the newly developed method evaluates their correlation using maximum likelihood estimation and extracts independent latent servers from them using principal component analysis. An integer linear programming method is then applied to assign a weight to each latent server so as to maximize the difference between true contacts and false ones. The proposed method is tested on the CASP7 data set. If the top L/5 predicted contacts are evaluated, where L is the protein size, the average accuracy is 73%, which is much higher than that of any previously reported study. Moreover, if only the 15 new fold CASP7 targets are considered, our method achieves an average accuracy of 37%, which is much better than that of the majority voting method, SVM-LOMETS, SVM-SEQ, and SAM-T06, which achieve average accuracies of 13.0%, 10.8%, 25.8% and 21.2%, respectively. CONCLUSION: Reducing server correlation and optimally combining independent latent servers yields a significant improvement over the traditional consensus methods. This approach can hopefully provide a powerful tool for protein structure refinement and prediction.
format	Text
id	pubmed-2689239
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2689239 2009-06-02 Improving consensus contact prediction via server correlation reduction Gao, Xin; Bu, Dongbo; Xu, Jinbo; Li, Ming BMC Struct Biol Research Article BioMed Central 2009-05-06 /pmc/articles/PMC2689239/ /pubmed/19419562 http://dx.doi.org/10.1186/1472-6807-9-28 Text en Copyright © 2009 Gao et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Improving consensus contact prediction via server correlation reduction
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2689239/ https://www.ncbi.nlm.nih.gov/pubmed/19419562 http://dx.doi.org/10.1186/1472-6807-9-28
