Insight into the protein solubility driving forces with neural attention

Protein solubility is a key aspect for many biotechnological, biomedical and industrial processes, such as the production of active proteins and antibodies. In addition, understanding the molecular determinants of the solubility of proteins may be crucial to shed light on the molecular mechanisms of...

Bibliographic Details
Main authors: Raimondi, Daniele; Orlando, Gabriele; Fariselli, Piero; Moreau, Yves
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7217484/
https://www.ncbi.nlm.nih.gov/pubmed/32352965
http://dx.doi.org/10.1371/journal.pcbi.1007722
author Raimondi, Daniele
Orlando, Gabriele
Fariselli, Piero
Moreau, Yves
author_facet Raimondi, Daniele
Orlando, Gabriele
Fariselli, Piero
Moreau, Yves
author_sort Raimondi, Daniele
collection PubMed
description Protein solubility is a key aspect of many biotechnological, biomedical and industrial processes, such as the production of active proteins and antibodies. In addition, understanding the molecular determinants of protein solubility may be crucial to shed light on the molecular mechanisms of diseases caused by aggregation processes, such as amyloidosis. Here we present SKADE, a novel neural network protein solubility predictor, and we show how it can provide new insight into protein solubility mechanisms thanks to its neural attention architecture. First, we show that SKADE compares favorably with state-of-the-art tools while using just the protein sequence as input. Then, thanks to the neural attention mechanism, we use SKADE to investigate the patterns learned during training and we analyse its decision process. We use this feature to show that, while the attention profiles do not correlate with obvious sequence aspects such as the biophysical properties of the amino acids, they suggest that the N- and C-termini are the most relevant regions for solubility prediction, and they are predictive of complex emergent properties such as contact density and aggregation-prone regions involved in beta-amyloidosis. Moreover, SKADE is able to identify mutations that increase or decrease the overall solubility of the protein, allowing it to be used to perform large-scale in silico mutagenesis of proteins in order to maximize their solubility.
format Online
Article
Text
id pubmed-7217484
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7217484 2020-05-29 Insight into the protein solubility driving forces with neural attention Raimondi, Daniele Orlando, Gabriele Fariselli, Piero Moreau, Yves PLoS Comput Biol Research Article
Public Library of Science 2020-04-30 /pmc/articles/PMC7217484/ /pubmed/32352965 http://dx.doi.org/10.1371/journal.pcbi.1007722 Text en © 2020 Raimondi et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Raimondi, Daniele
Orlando, Gabriele
Fariselli, Piero
Moreau, Yves
Insight into the protein solubility driving forces with neural attention
title Insight into the protein solubility driving forces with neural attention
title_full Insight into the protein solubility driving forces with neural attention
title_fullStr Insight into the protein solubility driving forces with neural attention
title_full_unstemmed Insight into the protein solubility driving forces with neural attention
title_short Insight into the protein solubility driving forces with neural attention
title_sort insight into the protein solubility driving forces with neural attention
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7217484/
https://www.ncbi.nlm.nih.gov/pubmed/32352965
http://dx.doi.org/10.1371/journal.pcbi.1007722