A Hybrid Deep Learning Model for Predicting Protein Hydroxylation Sites

Bibliographic Details
Main authors: Long, Haixia; Liao, Bo; Xu, Xingyu; Yang, Jialiang
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6164125/
https://www.ncbi.nlm.nih.gov/pubmed/30231550
http://dx.doi.org/10.3390/ijms19092817
collection PubMed
description Protein hydroxylation is a type of post-translational modification (PTM) that plays critical roles in human diseases. Protein sequences are known to contain many uncharacterized proline and lysine residues. The open question is which of these residues can be hydroxylated and which cannot. The answer will not only help explain the mechanism of hydroxylation but can also benefit the development of new drugs. In this paper, we propose a novel approach for predicting hydroxylation sites using a hybrid deep learning model that integrates a convolutional neural network (CNN) and a long short-term memory network (LSTM). We employed a pseudo amino acid composition (PseAAC) method to construct valid benchmark datasets based on a sliding window strategy, and used the position-specific scoring matrix (PSSM) to represent samples as inputs to the deep learning model. In addition, we compared our method with popular predictors including CNN, iHyd-PseAAC, and iHyd-PseCp. The 5-fold cross-validation results all showed that our method significantly outperforms the other methods in prediction accuracy.
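The sliding window strategy named in the description can be illustrated with a minimal Python sketch. It cuts a fixed-length peptide centred on every proline (P) or lysine (K) in a sequence and pads the ends so that residues near a terminus still yield a full window. The function name `extract_windows`, the half-width of 6, and the padding symbol `X` are assumptions made for illustration; this record does not state the window size or padding scheme used by the authors, and the PSSM encoding and CNN-LSTM model are not shown.

```python
from typing import List, Tuple


def extract_windows(
    sequence: str,
    targets: str = "PK",   # candidate residues: proline and lysine
    half_width: int = 6,   # hypothetical value, not taken from the record
    pad: str = "X",        # hypothetical padding symbol for sequence ends
) -> List[Tuple[int, str, str]]:
    """Return (1-based position, residue, window) for each target residue."""
    # Pad both ends so every window has length 2 * half_width + 1.
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            # Residue i sits at index i + half_width in `padded`, so the
            # slice starting at i is centred on it.
            window = padded[i : i + 2 * half_width + 1]
            windows.append((i + 1, residue, window))
    return windows


if __name__ == "__main__":
    # Toy sequence, not real data.
    for position, residue, window in extract_windows("MKAPLG", half_width=2):
        print(position, residue, window)
```

Each window would then be labelled as a positive or negative sample and encoded (for example, as PSSM rows) before being passed to a classifier.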
format Online
Article
Text
id pubmed-6164125
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6164125 2018-10-10 Int J Mol Sci Article MDPI 2018-09-18 /pmc/articles/PMC6164125/ /pubmed/30231550 http://dx.doi.org/10.3390/ijms19092817 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title A Hybrid Deep Learning Model for Predicting Protein Hydroxylation Sites
topic Article