Structure-based machine-guided mapping of amyloid sequence space reveals uncharted sequence clusters with higher solubilities

Bibliographic Details
Main authors: Louros, Nikolaos; Orlando, Gabriele; De Vleeschouwer, Matthias; Rousseau, Frederic; Schymkowitz, Joost
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7335209/
https://www.ncbi.nlm.nih.gov/pubmed/32620861
http://dx.doi.org/10.1038/s41467-020-17207-3
_version_ 1783554096466231296
author Louros, Nikolaos
Orlando, Gabriele
De Vleeschouwer, Matthias
Rousseau, Frederic
Schymkowitz, Joost
collection PubMed
description The amyloid conformation can be adopted by a variety of sequences, but the precise boundaries of amyloid sequence space are still unclear. The currently charted amyloid sequence space is strongly biased towards hydrophobic, beta-sheet prone sequences that form the core of globular proteins and towards Q/N/Y-rich yeast prions. Here, we took advantage of the increasing amount of high-resolution structural information on amyloid cores currently available in the Protein Data Bank to implement a machine learning approach, named Cordax (https://cordax.switchlab.org), that explores amyloid sequence space beyond its current boundaries. Clustering by t-Distributed Stochastic Neighbour Embedding (t-SNE) shows how our approach resulted in an expansion away from hydrophobic amyloid sequences towards clusters of lower aliphatic content and higher charge, or regions of helical and disordered propensities. These clusters uncouple amyloid propensity from solubility, representing sequence flavours compatible with surface-exposed patches in globular proteins, functional amyloids or sequences associated with liquid-liquid phase transitions.
format Online
Article
Text
id pubmed-7335209
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7335209 2020-07-09 Nat Commun Article Nature Publishing Group UK 2020-07-03 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Structure-based machine-guided mapping of amyloid sequence space reveals uncharted sequence clusters with higher solubilities
topic Article