Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network
Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step toward understanding its structure and function. In this work, we present a deep learning method to predict residue solvent accessibili...
Main Authors: | Zhang, Buzhong; Li, Linqing; Lü, Qiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6023031/ https://www.ncbi.nlm.nih.gov/pubmed/29799510 http://dx.doi.org/10.3390/biom8020033 |
_version_ | 1783335779326492672 |
---|---|
author | Zhang, Buzhong; Li, Linqing; Lü, Qiang |
author_facet | Zhang, Buzhong; Li, Linqing; Lü, Qiang |
author_sort | Zhang, Buzhong |
collection | PubMed |
description | Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step toward understanding its structure and function. In this work, we present a deep learning method for predicting residue solvent accessibility, based on a stacked deep bidirectional recurrent neural network applied to sequence profiles. To capture more long-range sequence information, a merging operator is proposed for combining the bidirectional information from hidden nodes into the outputs. Three types of merging operators were used in our improved model, with a long short-term memory network serving as the hidden computing node. The training dataset was constructed from 7361 proteins extracted from the PISCES server using a cut-off of 25% sequence identity. Sequence-derived features, including the position-specific scoring matrix, physical properties, physicochemical characteristics, conservation score and protein coding, were used to represent each residue. With this method, continuous relative solvent-accessible area values were predicted and then transformed into binary states using predefined thresholds. Our experimental results showed that our deep learning method improved prediction quality relative to current methods, with mean absolute error and Pearson’s correlation coefficient values of 8.8% and 74.8%, respectively, on the CB502 dataset and 8.2% and 78%, respectively, on the Manesh215 dataset. |
format | Online Article Text |
id | pubmed-6023031 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6023031 2018-07-02 Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network Zhang, Buzhong; Li, Linqing; Lü, Qiang Biomolecules Article MDPI 2018-05-25 /pmc/articles/PMC6023031/ /pubmed/29799510 http://dx.doi.org/10.3390/biom8020033 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhang, Buzhong Li, Linqing Lü, Qiang Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network |
title | Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network |
topic | Article |
work_keys_str_mv | AT zhangbuzhong proteinsolventaccessibilitypredictionbyastackeddeepbidirectionalrecurrentneuralnetwork AT lilinqing proteinsolventaccessibilitypredictionbyastackeddeepbidirectionalrecurrentneuralnetwork AT luqiang proteinsolventaccessibilitypredictionbyastackeddeepbidirectionalrecurrentneuralnetwork |
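
The description in this record mentions three steps that can be illustrated in a few lines: merging forward and backward hidden states, converting continuous relative solvent accessibility (RSA) into binary states, and scoring predictions with mean absolute error and Pearson's correlation coefficient. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the record does not name the three merging operators, so `concat`, `sum` and `average` are hypothetical stand-ins, and the 0.25 threshold and all function names are likewise assumed for the example.

```python
import math


def merge_bidirectional(forward, backward, mode="concat"):
    """Merge forward and backward hidden vectors for one residue.

    The modes here are hypothetical examples of merging operators;
    the record does not say which three the paper actually uses.
    """
    if mode == "concat":
        return list(forward) + list(backward)
    if len(forward) != len(backward):
        raise ValueError("sum/average need vectors of equal length")
    if mode == "sum":
        return [f + b for f, b in zip(forward, backward)]
    if mode == "average":
        return [(f + b) / 2 for f, b in zip(forward, backward)]
    raise ValueError(f"unknown merge mode: {mode}")


def to_binary_states(rsa_values, threshold=0.25):
    """Map continuous RSA values to 0 (buried) or 1 (exposed).

    The 0.25 cut-off is an assumed example; the record only says
    that predefined thresholds were used.
    """
    return [1 if value >= threshold else 0 for value in rsa_values]


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed RSA."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def pearson_correlation(predicted, observed):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Toy values, invented purely for illustration.
observed_rsa = [0.10, 0.40, 0.80, 0.05]
predicted_rsa = [0.15, 0.35, 0.70, 0.10]

print("merged:", merge_bidirectional([0.2, 0.4], [0.6, 0.8], mode="average"))
print("binary states:", to_binary_states(predicted_rsa))
print("MAE:", mean_absolute_error(predicted_rsa, observed_rsa))
print("PCC:", pearson_correlation(predicted_rsa, observed_rsa))
```

In this sketch RSA is expressed as a fraction between 0 and 1, so the percentage figures quoted in the description would correspond to these values multiplied by 100.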