Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network

Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step to understand its structure and function. In this work, we present a deep learning method to predict residue solvent accessibili...


Bibliographic Details
Main authors: Zhang, Buzhong; Li, Linqing; Lü, Qiang
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6023031/
https://www.ncbi.nlm.nih.gov/pubmed/29799510
http://dx.doi.org/10.3390/biom8020033
author Zhang, Buzhong
Li, Linqing
Lü, Qiang
collection PubMed
description Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step to understand its structure and function. In this work, we present a deep learning method to predict residue solvent accessibility, which is based on a stacked deep bidirectional recurrent neural network applied to sequence profiles. To capture more long-range sequence information, a merging operator was proposed when bidirectional information from hidden nodes was merged for outputs. Three types of merging operators were used in our improved model, with a long short-term memory network performing as a hidden computing node. The trained database was constructed from 7361 proteins extracted from the PISCES server using a cut-off of 25% sequence identity. Sequence-derived features including position-specific scoring matrix, physical properties, physicochemical characteristics, conservation score and protein coding were used to represent a residue. Using this method, predictive values of continuous relative solvent-accessible area were obtained, and then, these values were transformed into binary states with predefined thresholds. Our experimental results showed that our deep learning method improved prediction quality relative to current methods, with mean absolute error and Pearson’s correlation coefficient values of 8.8% and 74.8%, respectively, on the CB502 dataset and 8.2% and 78%, respectively, on the Manesh215 dataset.
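The evaluation steps named in the description (thresholding continuous relative solvent accessibility into binary states, then scoring with mean absolute error and Pearson's correlation coefficient) can be sketched as below. This is only an illustrative sketch, not the authors' code: the description does not state the threshold values, so the 0.25 cut-off, the function names, and the sample numbers are all assumptions made for the example.

```python
import math


def to_binary_states(rsa_values, threshold=0.25):
    """Label a residue exposed (1) if its relative solvent accessibility
    exceeds the threshold, otherwise buried (0).
    The 0.25 default is a hypothetical value, not taken from the record."""
    return [1 if value > threshold else 0 for value in rsa_values]


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def pearson_correlation(predicted, observed):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Made-up relative solvent accessibility values for four residues (fractions of 1).
predicted = [0.10, 0.40, 0.35, 0.05]
observed = [0.15, 0.30, 0.45, 0.05]

states = to_binary_states(predicted)
mae = mean_absolute_error(predicted, observed)
pcc = pearson_correlation(predicted, observed)
print(states, mae, pcc)
```

Plain lists and the standard library are used here so the sketch runs without dependencies; an array library would be the usual choice for per-protein evaluation at scale.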
format Online
Article
Text
id pubmed-6023031
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6023031 2018-07-02 Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network Zhang, Buzhong Li, Linqing Lü, Qiang Biomolecules Article MDPI 2018-05-25 /pmc/articles/PMC6023031/ /pubmed/29799510 http://dx.doi.org/10.3390/biom8020033 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network
topic Article