
New amino acid substitution matrix brings sequence alignments into agreement with structure matches

Protein sequence matching presently fails to identify many structures that are highly similar, even when they are known to have the same function. The high packing densities in globular proteins lead to interdependent substitutions, which have not previously been considered for amino acid similariti...


Bibliographic Details
Main authors: Jia, Kejue; Jernigan, Robert L
Format: Online Article Text
Language: English
Published: 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8641535/
https://www.ncbi.nlm.nih.gov/pubmed/33469973
http://dx.doi.org/10.1002/prot.26050
author Jia, Kejue
Jernigan, Robert L
collection PubMed
description Protein sequence matching presently fails to identify many structures that are highly similar, even when they are known to have the same function. The high packing densities in globular proteins lead to interdependent substitutions, which have not previously been considered for amino acid similarities. At present, sequence matching compares sequences based only upon the similarities of single amino acids, ignoring the fact that in densely packed proteins there are additional conservative substitutions representing exchanges between two interacting amino acids, such as a small-large pair changing to a large-small pair, substitutions that are not individually so conservative. Here we show that including information for such pairs of substitutions yields improved sequence matches, and that these yield significant gains in the agreement between sequence alignments and structure matches of the same protein pair. The result shows sequence segments matched where structure segments are aligned. There are gains for all 2002 collected cases in which the sequence alignments were not previously congruent with the structure matches. Our results also demonstrate a significant gain in detecting homology for “twilight zone” protein sequences. The amino acid substitution metrics derived have many other potential applications, for annotations, protein design, mutagenesis design, and empirical potential derivation.
format Online
Article
Text
id pubmed-8641535
institution National Center for Biotechnology Information
language English
publishDate 2021
record_format MEDLINE/PubMed
spelling pubmed-8641535 2022-06-01 New amino acid substitution matrix brings sequence alignments into agreement with structure matches Jia, Kejue Jernigan, Robert L Proteins Article 2021-02-02 2021-06 /pmc/articles/PMC8641535/ /pubmed/33469973 http://dx.doi.org/10.1002/prot.26050 Text en https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title New amino acid substitution matrix brings sequence alignments into agreement with structure matches
topic Article