Solvent Accessibility of Residues Undergoing Pathogenic Variations in Humans: From Protein Structures to Protein Sequences

Solvent accessibility (SASA) is a key feature of proteins for determining their folding and stability. SASA is computed from protein structures with different algorithms, and from protein sequences with machine-learning based approaches trained on solved structures. Here we ask the question as to wh...

Bibliographic Details
Main Authors: Savojardo, Castrense; Manfredi, Matteo; Martelli, Pier Luigi; Casadio, Rita
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Molecular Biosciences
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7817970/
https://www.ncbi.nlm.nih.gov/pubmed/33490109
http://dx.doi.org/10.3389/fmolb.2020.626363
author Savojardo, Castrense
Manfredi, Matteo
Martelli, Pier Luigi
Casadio, Rita
author_facet Savojardo, Castrense
Manfredi, Matteo
Martelli, Pier Luigi
Casadio, Rita
author_sort Savojardo, Castrense
collection PubMed
description Solvent accessibility (SASA) is a key feature of proteins for determining their folding and stability. SASA is computed from protein structures with different algorithms, and from protein sequences with machine-learning based approaches trained on solved structures. Here we ask the question as to which extent solvent exposure of residues can be associated to the pathogenicity of the variation. By this, SASA of the wild-type residue acquires a role in the context of functional annotation of protein single-residue variations (SRVs). By mapping variations on a curated database of human protein structures, we found that residues targeted by disease related SRVs are less accessible to solvent than residues involved in polymorphisms. The disease association is not evenly distributed among the different residue types: SRVs targeting glycine, tryptophan, tyrosine, and cysteine are more frequently disease associated than others. For all residues, the proportion of disease related SRVs largely increases when the wild-type residue is buried and decreases when it is exposed. The extent of the increase depends on the residue type. With the aid of an in house developed predictor, based on a deep learning procedure and performing at the state-of-the-art, we are able to confirm the above tendency by analyzing a large data set of residues subjected to variations and occurring in some 12,494 human protein sequences still lacking three-dimensional structure (derived from HUMSAVAR). Our data support the notion that surface accessible area is a distinguished property of residues that undergo variation and that pathogenicity is more frequently associated to the buried property than to the exposed one.
format Online
Article
Text
id pubmed-7817970
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-78179702021-01-22 Solvent Accessibility of Residues Undergoing Pathogenic Variations in Humans: From Protein Structures to Protein Sequences Savojardo, Castrense Manfredi, Matteo Martelli, Pier Luigi Casadio, Rita Front Mol Biosci Molecular Biosciences Solvent accessibility (SASA) is a key feature of proteins for determining their folding and stability. SASA is computed from protein structures with different algorithms, and from protein sequences with machine-learning based approaches trained on solved structures. Here we ask the question as to which extent solvent exposure of residues can be associated to the pathogenicity of the variation. By this, SASA of the wild-type residue acquires a role in the context of functional annotation of protein single-residue variations (SRVs). By mapping variations on a curated database of human protein structures, we found that residues targeted by disease related SRVs are less accessible to solvent than residues involved in polymorphisms. The disease association is not evenly distributed among the different residue types: SRVs targeting glycine, tryptophan, tyrosine, and cysteine are more frequently disease associated than others. For all residues, the proportion of disease related SRVs largely increases when the wild-type residue is buried and decreases when it is exposed. The extent of the increase depends on the residue type. With the aid of an in house developed predictor, based on a deep learning procedure and performing at the state-of-the-art, we are able to confirm the above tendency by analyzing a large data set of residues subjected to variations and occurring in some 12,494 human protein sequences still lacking three-dimensional structure (derived from HUMSAVAR). Our data support the notion that surface accessible area is a distinguished property of residues that undergo variation and that pathogenicity is more frequently associated to the buried property than to the exposed one. Frontiers Media S.A. 
2021-01-07 /pmc/articles/PMC7817970/ /pubmed/33490109 http://dx.doi.org/10.3389/fmolb.2020.626363 Text en Copyright © 2021 Savojardo, Manfredi, Martelli and Casadio. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Molecular Biosciences
Savojardo, Castrense
Manfredi, Matteo
Martelli, Pier Luigi
Casadio, Rita
Solvent Accessibility of Residues Undergoing Pathogenic Variations in Humans: From Protein Structures to Protein Sequences
title Solvent Accessibility of Residues Undergoing Pathogenic Variations in Humans: From Protein Structures to Protein Sequences
topic Molecular Biosciences