A two-stage approach for improved prediction of residue contact maps

BACKGROUND: Protein topology representations such as residue contact maps are an important intermediate step towards ab initio prediction of protein structure. Although improvements have occurred over the last years, the problem of accurately predicting residue contact maps from primary sequences is...

Bibliographic Details
Main authors:	Vullo, Alessandro; Walsh, Ian; Pollastri, Gianluca
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1484494/ https://www.ncbi.nlm.nih.gov/pubmed/16573808 http://dx.doi.org/10.1186/1471-2105-7-180

_version_	1782128334482702336
author	Vullo, Alessandro Walsh, Ian Pollastri, Gianluca
author_facet	Vullo, Alessandro Walsh, Ian Pollastri, Gianluca
author_sort	Vullo, Alessandro
collection	PubMed
description	BACKGROUND: Protein topology representations such as residue contact maps are an important intermediate step towards ab initio prediction of protein structure. Although improvements have occurred over the last years, the problem of accurately predicting residue contact maps from primary sequences is still largely unsolved. Among the reasons for this are the unbalanced nature of the problem (with far fewer examples of contacts than non-contacts), the formidable challenge of capturing long-range interactions in the maps, and the intrinsic difficulty of mapping one-dimensional input sequences into two-dimensional output maps. In order to alleviate these problems and achieve improved contact map predictions, in this paper we split the task into two stages: the prediction of a map's principal eigenvector (PE) from the primary sequence, and the reconstruction of the contact map from the PE and primary sequence. Predicting the PE from the primary sequence consists in mapping a vector into a vector. This task is less complex than mapping vectors directly into two-dimensional matrices since the size of the problem is drastically reduced and so is the scale length of interactions that need to be learned. RESULTS: We develop architectures composed of ensembles of two-layered bidirectional recurrent neural networks to classify the components of the PE into 2, 3 and 4 classes from protein primary sequence, predicted secondary structure, and hydrophobicity interaction scales. Our predictor, tested on a non-redundant set of 2171 proteins, achieves classification performances of up to 72.6%, 16% above a baseline statistical predictor. We design a system for the prediction of contact maps from the predicted PE. Our results show that predicting maps through the PE yields sizeable gains, especially for long-range contacts, which are particularly critical for accurate protein 3D reconstruction. The final predictor's accuracy on a non-redundant set of 327 targets is 35.4% and 19.8% for minimum contact separations of 12 and 24, respectively, when the top length/5 contacts are selected. On the 11 CASP6 Novel Fold targets we achieve similar accuracies (36.5% and 19.7%). This compares favourably with the best automated predictors at CASP6. CONCLUSION: Our final system for contact map prediction achieves state-of-the-art performance, and may provide valuable constraints for improved ab initio prediction of protein structures. A suite of predictors of structural features, including the PE, and PE-based contact maps, is available at .
format	Text
id	pubmed-1484494
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1484494 2006-07-10 A two-stage approach for improved prediction of residue contact maps Vullo, Alessandro Walsh, Ian Pollastri, Gianluca BMC Bioinformatics Research Article BioMed Central 2006-03-30 /pmc/articles/PMC1484494/ /pubmed/16573808 http://dx.doi.org/10.1186/1471-2105-7-180 Text en Copyright © 2006 Vullo et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Vullo, Alessandro Walsh, Ian Pollastri, Gianluca A two-stage approach for improved prediction of residue contact maps
title	A two-stage approach for improved prediction of residue contact maps
title_full	A two-stage approach for improved prediction of residue contact maps
title_fullStr	A two-stage approach for improved prediction of residue contact maps
title_full_unstemmed	A two-stage approach for improved prediction of residue contact maps
title_short	A two-stage approach for improved prediction of residue contact maps
title_sort	two-stage approach for improved prediction of residue contact maps
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1484494/ https://www.ncbi.nlm.nih.gov/pubmed/16573808 http://dx.doi.org/10.1186/1471-2105-7-180
work_keys_str_mv	AT vulloalessandro atwostageapproachforimprovedpredictionofresiduecontactmaps AT walshian atwostageapproachforimprovedpredictionofresiduecontactmaps AT pollastrigianluca atwostageapproachforimprovedpredictionofresiduecontactmaps AT vulloalessandro twostageapproachforimprovedpredictionofresiduecontactmaps AT walshian twostageapproachforimprovedpredictionofresiduecontactmaps AT pollastrigianluca twostageapproachforimprovedpredictionofresiduecontactmaps
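
The two quantities named in the abstract, a contact map's principal eigenvector (PE) and the accuracy over the top length/5 predicted contacts at a minimum sequence separation, can be illustrated with a minimal sketch. This is not the authors' predictor: the function names, the use of power iteration, and the exact accuracy definition are assumptions made for illustration, since the record does not give the paper's definitions.

```python
import math


def principal_eigenvector(contact_map, iterations=500):
    """Approximate the principal eigenvector of a symmetric 0/1 contact map.

    Assumption: power iteration is applied to (A + I) rather than A. The shift
    leaves the eigenvectors unchanged but prevents oscillation when A has
    eigenvalues of equal magnitude and opposite sign.
    """
    n = len(contact_map)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # w = (A + I) v
        w = [v[i] + sum(contact_map[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def top_contacts_accuracy(scores, true_map, min_separation, fraction=5):
    """Share of true contacts among the top length/fraction scored pairs.

    Assumption: only pairs (i, j) with j - i >= min_separation are ranked,
    and 'length' is the number of residues.
    """
    n = len(true_map)
    pairs = [(scores[i][j], i, j)
             for i in range(n)
             for j in range(i + min_separation, n)]
    pairs.sort(reverse=True)  # highest predicted score first
    top = pairs[:max(1, n // fraction)]
    if not top:
        return 0.0
    return sum(true_map[i][j] for _, i, j in top) / len(top)


if __name__ == "__main__":
    # Toy three-residue chain in which residue 1 touches residues 0 and 2.
    toy_map = [[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]]
    print(principal_eigenvector(toy_map))
```

In this toy chain the middle residue has the most contacts and therefore the largest PE component, which is the kind of per-residue signal the first stage described in the abstract is trained to predict from sequence.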
