Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families
Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryote-specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning, it is often possi...
Main Authors: | Bassot, Claudio; Elofsson, Arne |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8078820/ https://www.ncbi.nlm.nih.gov/pubmed/33857128 http://dx.doi.org/10.1371/journal.pcbi.1008798 |
_version_ | 1783685111845224448 |
---|---|
author | Bassot, Claudio Elofsson, Arne |
author_facet | Bassot, Claudio Elofsson, Arne |
author_sort | Bassot, Claudio |
collection | PubMed |
description | Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryote-specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning, it is often possible to predict a protein’s structure. However, the unique sequence features of repeat proteins have made it challenging to use direct coupling analysis for predicting contacts. Here, we show that deep learning-based methods (trRosetta, DeepMetaPsicov (DMP) and PconsC4) overcome this problem and can predict intra- and inter-unit contacts in repeat proteins. In a benchmark dataset of 815 repeat proteins, about 90% can be correctly modelled. Further, among 48 PFAM families lacking a protein structure, we produce models of 41 families with estimated high accuracy. |
format | Online Article Text |
id | pubmed-8078820 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8078820 2021-05-06 Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families Bassot, Claudio Elofsson, Arne PLoS Comput Biol Research Article Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryote-specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning, it is often possible to predict a protein’s structure. However, the unique sequence features of repeat proteins have made it challenging to use direct coupling analysis for predicting contacts. Here, we show that deep learning-based methods (trRosetta, DeepMetaPsicov (DMP) and PconsC4) overcome this problem and can predict intra- and inter-unit contacts in repeat proteins. In a benchmark dataset of 815 repeat proteins, about 90% can be correctly modelled. Further, among 48 PFAM families lacking a protein structure, we produce models of 41 families with estimated high accuracy. Public Library of Science 2021-04-15 /pmc/articles/PMC8078820/ /pubmed/33857128 http://dx.doi.org/10.1371/journal.pcbi.1008798 Text en © 2021 Bassot, Elofsson https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Bassot, Claudio Elofsson, Arne Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families |
title | Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families |
title_full | Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families |
title_fullStr | Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families |
title_full_unstemmed | Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families |
title_short | Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families |
title_sort | accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8078820/ https://www.ncbi.nlm.nih.gov/pubmed/33857128 http://dx.doi.org/10.1371/journal.pcbi.1008798 |
work_keys_str_mv | AT bassotclaudio accuratecontactbasedmodellingofrepeatproteinspredictsthestructureofnewrepeatsproteinfamilies AT elofssonarne accuratecontactbasedmodellingofrepeatproteinspredictsthestructureofnewrepeatsproteinfamilies |