The importance of input sequence set to consensus-derived proteins and their relationship to reconstructed ancestral proteins
A protein sequence encodes its energy landscape - all the accessible conformations, energetics, and dynamics. The evolutionary relationship between sequence and landscape can be probed phylogenetically by compiling a multiple sequence alignment of homologous sequences and generating common ancestors...
Main authors: | Nixon, Charlotte; Lim, Shion A.; Sternke, Matt; Barrick, Doug; Harms, Mike; Marqusee, Susan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10327145/ https://www.ncbi.nlm.nih.gov/pubmed/37425932 http://dx.doi.org/10.1101/2023.06.29.547063 |
_version_ | 1785069565434134528 |
---|---|
author | Nixon, Charlotte; Lim, Shion A.; Sternke, Matt; Barrick, Doug; Harms, Mike; Marqusee, Susan |
author_sort | Nixon, Charlotte |
collection | PubMed |
description | A protein sequence encodes its energy landscape: all the accessible conformations, energetics, and dynamics. The evolutionary relationship between sequence and landscape can be probed phylogenetically by compiling a multiple sequence alignment of homologous sequences and generating either common ancestors via Ancestral Sequence Reconstruction or a consensus protein containing the most common amino acid at each position. Both ancestral and consensus proteins are often more stable than their extant homologs, which raises the question of how the two differ and suggests that both approaches serve as general methods to engineer thermostability. We used the Ribonuclease H family to compare these approaches and evaluate how the evolutionary relationship of the input sequences affects the properties of the resulting consensus protein. While the overall consensus protein is structured and active, it neither shows properties of a well-folded protein nor has enhanced stability. In contrast, the consensus protein derived from a phylogenetically restricted region is significantly more stable and cooperatively folded, suggesting that cooperativity may be encoded by different mechanisms in separate clades and lost when too many diverse clades are combined to generate a consensus protein. To explore this, we compared pairwise covariance scores using a Potts formalism as well as higher-order couplings using singular value decomposition (SVD). We find that the SVD coordinates of a stable consensus sequence are close to the coordinates of the analogous ancestor sequence and its descendants, whereas the unstable consensus sequences are outliers in SVD space. |
format | Online Article Text |
id | pubmed-10327145 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10327145 2023-07-08 bioRxiv Article Cold Spring Harbor Laboratory 2023-07-01 /pmc/articles/PMC10327145/ /pubmed/37425932 http://dx.doi.org/10.1101/2023.06.29.547063 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
title | The importance of input sequence set to consensus-derived proteins and their relationship to reconstructed ancestral proteins |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10327145/ https://www.ncbi.nlm.nih.gov/pubmed/37425932 http://dx.doi.org/10.1101/2023.06.29.547063 |
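
The description above defines a consensus protein as the sequence containing the most common amino acid at each position of a multiple sequence alignment. As a minimal sketch of that definition only, and not the authors' pipeline, the Python below computes a consensus from a toy alignment. The function name `consensus_sequence`, the gap character `-`, the choice to ignore gaps when counting, and the toy sequences are all assumptions made for illustration; ties are resolved in favor of the residue seen first in the column.

```python
from collections import Counter


def consensus_sequence(aligned, gap="-"):
    """Return the most common non-gap residue at each alignment column.

    Hypothetical helper for illustration; real consensus design usually
    also filters and weights the input sequences.
    """
    if not aligned:
        raise ValueError("alignment is empty")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    # zip(*aligned) walks the alignment column by column.
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)
        # A column made only of gaps stays a gap in the consensus.
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


# Toy alignment (made-up sequences, not Ribonuclease H data).
toy_alignment = ["MKVL-A", "MRVLGA", "MKILGA", "MKVLGS"]
print(consensus_sequence(toy_alignment))
```

Because the consensus is a per-column majority vote, changing which sequences go into `toy_alignment` changes the result, which is the dependence on the input sequence set that the record's title refers to.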