Characterizing and explaining the impact of disease-associated mutations in proteins without known structures or structural homologs

Mutations in human proteins lead to diseases. The structure of these proteins can help understand the mechanism of such diseases and develop therapeutics against them. With improved deep learning techniques, such as RoseTTAFold and AlphaFold, we can predict the structure of proteins even in the abse...

Bibliographic Details
Main Authors: Sen, Neeladri; Anishchenko, Ivan; Bordin, Nicola; Sillitoe, Ian; Velankar, Sameer; Baker, David; Orengo, Christine
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Problem Solving Protocol
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9294430/
https://www.ncbi.nlm.nih.gov/pubmed/35641150
http://dx.doi.org/10.1093/bib/bbac187
author Sen, Neeladri
Anishchenko, Ivan
Bordin, Nicola
Sillitoe, Ian
Velankar, Sameer
Baker, David
Orengo, Christine
author_sort Sen, Neeladri
collection PubMed
description Mutations in human proteins lead to diseases. The structures of these proteins can help us understand the mechanisms of such diseases and develop therapeutics against them. With improved deep learning techniques, such as RoseTTAFold and AlphaFold, we can predict the structure of proteins even in the absence of structural homologs. We modeled and extracted the domains from 553 disease-associated human proteins without known protein structures or close homologs in the Protein Data Bank. We noticed that the model quality was higher, and the root mean square deviation (RMSD) between AlphaFold and RoseTTAFold models lower, for domains that could be assigned to CATH families than for those that could only be assigned to Pfam families of unknown structure or could not be assigned to either. We predicted ligand-binding sites, protein–protein interfaces and conserved residues in these predicted structures. We then explored whether the disease-associated missense mutations were in the proximity of these predicted functional sites, whether they destabilized the protein structure based on ddG calculations or whether they were predicted to be pathogenic. We could explain 80% of these disease-associated mutations based on proximity to functional sites, structural destabilization or pathogenicity. When compared to polymorphisms, a larger percentage of disease-associated missense mutations were buried, closer to predicted functional sites, and predicted to be destabilizing and pathogenic. Using models from the two state-of-the-art techniques provides greater confidence in our predictions, and we explain 93 additional mutations based on RoseTTAFold models that could not be explained based solely on AlphaFold models.
format Online
Article
Text
id pubmed-9294430
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9294430 2022-07-20 Brief Bioinform Problem Solving Protocol Oxford University Press 2022-06-01 /pmc/articles/PMC9294430/ /pubmed/35641150 http://dx.doi.org/10.1093/bib/bbac187 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Characterizing and explaining the impact of disease-associated mutations in proteins without known structures or structural homologs
topic Problem Solving Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9294430/
https://www.ncbi.nlm.nih.gov/pubmed/35641150
http://dx.doi.org/10.1093/bib/bbac187