
Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families

Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryote-specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning, it is often possi...


Bibliographic Details
Main authors: Bassot, Claudio; Elofsson, Arne
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8078820/
https://www.ncbi.nlm.nih.gov/pubmed/33857128
http://dx.doi.org/10.1371/journal.pcbi.1008798
author Bassot, Claudio
Elofsson, Arne
collection PubMed
description Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryote-specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning, it is often possible to predict a protein’s structure. However, the unique sequence features of repeat proteins have made it challenging to use direct coupling analysis to predict contacts. Here, we show that deep learning-based methods (trRosetta, DeepMetaPsicov (DMP) and PconsC4) overcome this problem and can predict intra- and inter-unit contacts in repeat proteins. In a benchmark dataset of 815 repeat proteins, about 90% can be correctly modelled. Further, among 48 PFAM families lacking a protein structure, we produce models of 41 families with estimated high accuracy.
format Online
Article
Text
id pubmed-8078820
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8078820 2021-05-06 Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families. Bassot, Claudio; Elofsson, Arne. PLoS Comput Biol. Research Article. Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryote-specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning, it is often possible to predict a protein’s structure. However, the unique sequence features of repeat proteins have made it challenging to use direct coupling analysis to predict contacts. Here, we show that deep learning-based methods (trRosetta, DeepMetaPsicov (DMP) and PconsC4) overcome this problem and can predict intra- and inter-unit contacts in repeat proteins. In a benchmark dataset of 815 repeat proteins, about 90% can be correctly modelled. Further, among 48 PFAM families lacking a protein structure, we produce models of 41 families with estimated high accuracy. Public Library of Science 2021-04-15 /pmc/articles/PMC8078820/ /pubmed/33857128 http://dx.doi.org/10.1371/journal.pcbi.1008798 Text en © 2021 Bassot, Elofsson https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families
topic Research Article