New amino acid substitution matrix brings sequence alignments into agreement with structure matches
Protein sequence matching presently fails to identify many structures that are highly similar, even when they are known to have the same function. The high packing densities in globular proteins lead to interdependent substitutions, which have not previously been considered for amino acid similariti...
Main authors: | Jia, Kejue; Jernigan, Robert L |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8641535/ https://www.ncbi.nlm.nih.gov/pubmed/33469973 http://dx.doi.org/10.1002/prot.26050 |
_version_ | 1784609515756322816 |
---|---|
author | Jia, Kejue Jernigan, Robert L |
author_facet | Jia, Kejue Jernigan, Robert L |
author_sort | Jia, Kejue |
collection | PubMed |
description | Protein sequence matching presently fails to identify many structures that are highly similar, even when they are known to have the same function. The high packing densities in globular proteins lead to interdependent substitutions, which have not previously been considered for amino acid similarities. At present, sequence matching compares sequences based only upon the similarities of single amino acids, ignoring the fact that in densely packed proteins there are additional conservative substitutions representing exchanges between two interacting amino acids, such as a small-large pair changing to a large-small pair, substitutions that are not individually so conservative. Here we show that including information for such pairs of substitutions yields improved sequence matches, and that these yield significant gains in the agreement between sequence alignments and structure matches of the same protein pair. The results show sequence segments matched where structure segments are aligned. There are gains for all 2002 collected cases where the sequence alignments were not previously congruent with the structure matches. Our results also demonstrate a significant gain in detecting homology for “twilight zone” protein sequences. The amino acid substitution metrics derived have many other potential applications, for annotations, protein design, mutagenesis design, and empirical potential derivation. |
format | Online Article Text |
id | pubmed-8641535 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-8641535 2022-06-01 New amino acid substitution matrix brings sequence alignments into agreement with structure matches Jia, Kejue Jernigan, Robert L Proteins Article Protein sequence matching presently fails to identify many structures that are highly similar, even when they are known to have the same function. The high packing densities in globular proteins lead to interdependent substitutions, which have not previously been considered for amino acid similarities. At present, sequence matching compares sequences based only upon the similarities of single amino acids, ignoring the fact that in densely packed proteins there are additional conservative substitutions representing exchanges between two interacting amino acids, such as a small-large pair changing to a large-small pair, substitutions that are not individually so conservative. Here we show that including information for such pairs of substitutions yields improved sequence matches, and that these yield significant gains in the agreement between sequence alignments and structure matches of the same protein pair. The results show sequence segments matched where structure segments are aligned. There are gains for all 2002 collected cases where the sequence alignments were not previously congruent with the structure matches. Our results also demonstrate a significant gain in detecting homology for “twilight zone” protein sequences. The amino acid substitution metrics derived have many other potential applications, for annotations, protein design, mutagenesis design, and empirical potential derivation. 2021-02-02 2021-06 /pmc/articles/PMC8641535/ /pubmed/33469973 http://dx.doi.org/10.1002/prot.26050 Text en https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
title | New amino acid substitution matrix brings sequence alignments into agreement with structure matches |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8641535/ https://www.ncbi.nlm.nih.gov/pubmed/33469973 http://dx.doi.org/10.1002/prot.26050 |