A Transfer-Learning-Based Deep Convolutional Neural Network for Predicting Leukemia-Related Phosphorylation Sites from Protein Primary Sequences

As one of the most important post-translational modifications (PTMs), phosphorylation is the attachment of a phosphate group to amino acid residues such as Ser (S), Thr (T) and Tyr (Y), resulting in diverse functions at the molecular level. Abnormal phosphorylation has been shown to be close...

Bibliographic Details
Main Authors: He, Jian; Wu, Yanling; Pu, Xuemei; Li, Menglong; Guo, Yanzhi
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8915183/
https://www.ncbi.nlm.nih.gov/pubmed/35163663
http://dx.doi.org/10.3390/ijms23031741
author He, Jian
Wu, Yanling
Pu, Xuemei
Li, Menglong
Guo, Yanzhi
collection PubMed
description As one of the most important post-translational modifications (PTMs), phosphorylation is the attachment of a phosphate group to amino acid residues such as Ser (S), Thr (T) and Tyr (Y), resulting in diverse functions at the molecular level. Abnormal phosphorylation has been shown to be closely related to human diseases. To our knowledge, no study has reported the prediction of disease-specific phosphorylation sites, which is of great significance for a comprehensive understanding of disease mechanisms. In this work, focusing on three types of leukemia, we aim to develop reliable leukemia-related phosphorylation site prediction models by combining a deep convolutional neural network (CNN) with transfer learning. A CNN can automatically discover complex representations of phosphorylation patterns from raw sequences, and hence provides a powerful tool for improving leukemia-related phosphorylation site prediction. On the largest dataset, that of myelogenous leukemia, the optimal models for S/T/Y phosphorylation sites achieve AUC values of 0.8784, 0.8328 and 0.7716, respectively. When transfer learning is applied to the smaller datasets, the models for T-cell and lymphoid leukemia also show promising performance by sharing the optimal parameters. Compared with five other machine-learning methods, our CNN models show superior performance. Finally, the leukemia-related pathogenesis analysis and distribution analysis of phosphorylated proteins, along with K-means clustering analysis and position-specific conservation profiles of the phosphorylation sites, all indicate the strong practical feasibility of our easy-to-use CNN models.
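The description's point that the CNN works directly on raw sequences can be illustrated with a minimal sketch of how candidate S/T/Y sites might be turned into fixed-size numeric inputs. This is an illustration under stated assumptions, not the authors' pipeline: the flank size, the padding symbol, and the function names `extract_windows` and `one_hot` are hypothetical and do not appear in the record.

```python
# Hypothetical sketch: window size, padding symbol, and function names are
# assumptions for illustration; none of them come from the abstract.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # assumed padding symbol for windows that run past either terminus


def extract_windows(sequence, targets="STY", flank=3):
    """Return (position, window) pairs centered on each candidate residue."""
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            # In the padded string, residue i sits at index i + flank, so the
            # window of length 2 * flank + 1 starts at index i.
            windows.append((i, padded[i:i + 2 * flank + 1]))
    return windows


def one_hot(window):
    """Encode a window as rows of 20 zeros/ones; padding becomes an all-zero row."""
    return [[1 if aa == residue else 0 for aa in AMINO_ACIDS] for residue in window]


if __name__ == "__main__":
    for position, window in extract_windows("MKSPTAYG"):
        matrix = one_hot(window)
        print(position, window, len(matrix), len(matrix[0]))
```

Each window yields a (2 * flank + 1) x 20 matrix, the kind of input a one-dimensional convolution can scan. In a transfer-learning setting such as the one described, the same encoding would presumably be applied to the large and small datasets alike so that learned parameters can be shared.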
format Online
Article
Text
id pubmed-8915183
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8915183 2022-03-12 Int J Mol Sci Article MDPI 2022-02-03 /pmc/articles/PMC8915183/ /pubmed/35163663 http://dx.doi.org/10.3390/ijms23031741 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Transfer-Learning-Based Deep Convolutional Neural Network for Predicting Leukemia-Related Phosphorylation Sites from Protein Primary Sequences
topic Article