Cargando…
From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction
Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analy...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2013
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3749948/ https://www.ncbi.nlm.nih.gov/pubmed/23990764 http://dx.doi.org/10.1371/journal.pcbi.1003176 |
_version_ | 1782477048283922432 |
---|---|
author | Cocco, Simona Monasson, Remi Weigt, Martin |
author_facet | Cocco, Simona Monasson, Remi Weigt, Martin |
author_sort | Cocco, Simona |
collection | PubMed |
description | Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. |
format | Online Article Text |
id | pubmed-3749948 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-37499482013-08-29 From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction Cocco, Simona Monasson, Remi Weigt, Martin PLoS Comput Biol Research Article Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. Public Library of Science 2013-08-22 /pmc/articles/PMC3749948/ /pubmed/23990764 http://dx.doi.org/10.1371/journal.pcbi.1003176 Text en © 2013 Cocco et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Cocco, Simona Monasson, Remi Weigt, Martin From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction |
title | From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction |
title_full | From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction |
title_fullStr | From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction |
title_full_unstemmed | From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction |
title_short | From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction |
title_sort | from principal component to direct coupling analysis of coevolution in proteins: low-eigenvalue modes are needed for structure prediction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3749948/ https://www.ncbi.nlm.nih.gov/pubmed/23990764 http://dx.doi.org/10.1371/journal.pcbi.1003176 |
work_keys_str_mv | AT coccosimona fromprincipalcomponenttodirectcouplinganalysisofcoevolutioninproteinsloweigenvaluemodesareneededforstructureprediction AT monassonremi fromprincipalcomponenttodirectcouplinganalysisofcoevolutioninproteinsloweigenvaluemodesareneededforstructureprediction AT weigtmartin fromprincipalcomponenttodirectcouplinganalysisofcoevolutioninproteinsloweigenvaluemodesareneededforstructureprediction |