Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models

Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSA) are emerging as powerful tools in computational biology. However the underlying assumptions of the relationship between the inferred effective Potts Hamiltonian and real protein structu...

Bibliographic Details
Main Authors: Jacquin, Hugo; Gilson, Amy; Shakhnovich, Eugene; Cocco, Simona; Monasson, Rémi
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4866778/
https://www.ncbi.nlm.nih.gov/pubmed/27177270
http://dx.doi.org/10.1371/journal.pcbi.1004889
author Jacquin, Hugo
Gilson, Amy
Shakhnovich, Eugene
Cocco, Simona
Monasson, Rémi
author_sort Jacquin, Hugo
collection PubMed
description Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSA) are emerging as powerful tools in computational biology. However, the underlying assumptions about the relationship between the inferred effective Potts Hamiltonian and real protein structure and energetics remain untested so far. Here we use lattice protein (LP) models to benchmark those inverse statistical approaches. We build MSAs of highly stable sequences in target LP structures, and infer the effective pairwise Potts Hamiltonians from those MSAs. We find that inferred Potts Hamiltonians reproduce many important aspects of ‘true’ LP structures and energetics. Careful analysis reveals that effective pairwise couplings in inferred Potts Hamiltonians depend not only on the energetics of the native structure but also on competing folds; in particular, the coupling values reflect both positive design (stabilization of the native conformation) and negative design (destabilization of competing folds). In addition to providing detailed structural information, the inferred Potts models, used as protein Hamiltonians for the design of new sequences, are able to generate with high probability completely new sequences with the desired folds, which is not possible using independent-site models. These are remarkable results, as the effective LP Hamiltonians used to generate the MSAs are not simple pairwise models due to the competition between the folds. Our findings elucidate the reasons for the success of inverse approaches to the modelling of proteins from sequence data, and their limitations.
format Online
Article
Text
id pubmed-4866778
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4866778 2016-05-18 Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models Jacquin, Hugo Gilson, Amy Shakhnovich, Eugene Cocco, Simona Monasson, Rémi PLoS Comput Biol Research Article
Public Library of Science 2016-05-13 /pmc/articles/PMC4866778/ /pubmed/27177270 http://dx.doi.org/10.1371/journal.pcbi.1004889 Text en © 2016 Jacquin et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4866778/
https://www.ncbi.nlm.nih.gov/pubmed/27177270
http://dx.doi.org/10.1371/journal.pcbi.1004889