
The importance of input sequence set to consensus-derived proteins and their relationship to reconstructed ancestral proteins

A protein sequence encodes its energy landscape - all the accessible conformations, energetics, and dynamics. The evolutionary relationship between sequence and landscape can be probed phylogenetically by compiling a multiple sequence alignment of homologous sequences and generating common ancestors...


Bibliographic Details
Main Authors: Nixon, Charlotte; Lim, Shion A.; Sternke, Matt; Barrick, Doug; Harms, Mike; Marqusee, Susan
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10327145/
https://www.ncbi.nlm.nih.gov/pubmed/37425932
http://dx.doi.org/10.1101/2023.06.29.547063
_version_ 1785069565434134528
author Nixon, Charlotte
Lim, Shion A.
Sternke, Matt
Barrick, Doug
Harms, Mike
Marqusee, Susan
collection PubMed
description A protein sequence encodes its energy landscape - all the accessible conformations, energetics, and dynamics. The evolutionary relationship between sequence and landscape can be probed phylogenetically by compiling a multiple sequence alignment of homologous sequences and generating common ancestors via Ancestral Sequence Reconstruction or a consensus protein containing the most common amino acid at each position. Both ancestral and consensus proteins are often more stable than their extant homologs - questioning the differences and suggesting that both approaches serve as general methods to engineer thermostability. We used the Ribonuclease H family to compare these approaches and evaluate how the evolutionary relationship of the input sequences affects the properties of the resulting consensus protein. While the overall consensus protein is structured and active, it neither shows properties of a well-folded protein nor has enhanced stability. In contrast, the consensus protein derived from a phylogenetically-restricted region is significantly more stable and cooperatively folded, suggesting that cooperativity may be encoded by different mechanisms in separate clades and lost when too many diverse clades are combined to generate a consensus protein. To explore this, we compared pairwise covariance scores using a Potts formalism as well as higher-order couplings using singular value decomposition (SVD). We find the SVD coordinates of a stable consensus sequence are close to coordinates of the analogous ancestor sequence and its descendants, whereas the unstable consensus sequences are outliers in SVD space.
format Online
Article
Text
id pubmed-10327145
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10327145 2023-07-08 bioRxiv Article
Cold Spring Harbor Laboratory 2023-07-01 /pmc/articles/PMC10327145/ /pubmed/37425932 http://dx.doi.org/10.1101/2023.06.29.547063 Text en This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title The importance of input sequence set to consensus-derived proteins and their relationship to reconstructed ancestral proteins
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10327145/
https://www.ncbi.nlm.nih.gov/pubmed/37425932
http://dx.doi.org/10.1101/2023.06.29.547063
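
The abstract above defines a consensus protein as the sequence containing the most common amino acid at each position of a multiple sequence alignment, and reports that the choice of input sequences changes the result. The following is a minimal Python sketch of that column-wise construction, not the authors' pipeline. The toy sequences, the function name `consensus`, the gap handling, and the alphabetical tie-breaking rule are all assumptions made for illustration; the abstract does not state how ties, gaps, or sequence weighting were treated.

```python
from collections import Counter


def consensus(alignment, gap="-"):
    """Return the most common non-gap residue in each alignment column.

    Assumption: ties are broken alphabetically so the output is deterministic.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    result = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            # Column contains only gaps: keep the gap character.
            result.append(gap)
            continue
        # Highest count first; alphabetical order among equal counts.
        best = min(counts, key=lambda res: (-counts[res], res))
        result.append(best)
    return "".join(result)


# Hypothetical toy alignment: two made-up "clades" of three sequences each.
clade_a = ["MKVLA", "MKVLG", "MRVLA"]
clade_b = ["MTIFA", "MTIYA", "MSIFA"]

full_consensus = consensus(clade_a + clade_b)
clade_a_consensus = consensus(clade_a)
clade_b_consensus = consensus(clade_b)

print("all sequences:", full_consensus)
print("clade A only: ", clade_a_consensus)
print("clade B only: ", clade_b_consensus)
```

In this toy case the consensus built from all six sequences mixes residues from both groups and matches neither clade-restricted consensus, which is the kind of input-set dependence the abstract describes. The sketch says nothing about stability or folding, which the study measured experimentally.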