A Transfer-Learning-Based Deep Convolutional Neural Network for Predicting Leukemia-Related Phosphorylation Sites from Protein Primary Sequences
As one of the most important post-translational modifications (PTMs), phosphorylation refers to the binding of a phosphate group to amino acid residues such as Ser (S), Thr (T), and Tyr (Y), resulting in diverse functions at the molecular level. Abnormal phosphorylation has been proved to be close...
Main authors: | He, Jian; Wu, Yanling; Pu, Xuemei; Li, Menglong; Guo, Yanzhi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8915183/ https://www.ncbi.nlm.nih.gov/pubmed/35163663 http://dx.doi.org/10.3390/ijms23031741 |
_version_ | 1784667958542336000 |
---|---|
author | He, Jian Wu, Yanling Pu, Xuemei Li, Menglong Guo, Yanzhi |
author_facet | He, Jian Wu, Yanling Pu, Xuemei Li, Menglong Guo, Yanzhi |
author_sort | He, Jian |
collection | PubMed |
description | As one of the most important post-translational modifications (PTMs), phosphorylation refers to the binding of a phosphate group to amino acid residues such as Ser (S), Thr (T), and Tyr (Y), resulting in diverse functions at the molecular level. Abnormal phosphorylation has been shown to be closely related to human diseases. To our knowledge, no research has been reported on predicting specific disease-associated phosphorylation sites, which is of great significance for a comprehensive understanding of disease mechanisms. In this work, focusing on three types of leukemia, we aim to develop reliable leukemia-related phosphorylation site prediction models by combining a deep convolutional neural network (CNN) with transfer learning. A CNN can automatically discover complex representations of phosphorylation patterns from raw sequences, and hence provides a powerful tool for improving leukemia-related phosphorylation site prediction. With the largest dataset, that of myelogenous leukemia, the optimal models for S/T/Y phosphorylation sites give AUC values of 0.8784, 0.8328, and 0.7716, respectively. When transfer learning is applied to the small datasets, the models for T-cell and lymphoid leukemia also give promising performance by sharing the optimal parameters. Compared with five other machine-learning methods, our CNN models show superior performance. Finally, the leukemia-related pathogenesis analysis and distribution analysis of phosphorylated proteins, along with K-means clustering analysis and position-specific conservation profiles of the phosphorylation sites, all indicate the strong practical feasibility of our easy-to-use CNN models. |
format | Online Article Text |
id | pubmed-8915183 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8915183 2022-03-12 A Transfer-Learning-Based Deep Convolutional Neural Network for Predicting Leukemia-Related Phosphorylation Sites from Protein Primary Sequences He, Jian Wu, Yanling Pu, Xuemei Li, Menglong Guo, Yanzhi Int J Mol Sci Article MDPI 2022-02-03 /pmc/articles/PMC8915183/ /pubmed/35163663 http://dx.doi.org/10.3390/ijms23031741 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article He, Jian Wu, Yanling Pu, Xuemei Li, Menglong Guo, Yanzhi A Transfer-Learning-Based Deep Convolutional Neural Network for Predicting Leukemia-Related Phosphorylation Sites from Protein Primary Sequences |
title | A Transfer-Learning-Based Deep Convolutional Neural Network for Predicting Leukemia-Related Phosphorylation Sites from Protein Primary Sequences |
title_sort | transfer-learning-based deep convolutional neural network for predicting leukemia-related phosphorylation sites from protein primary sequences |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8915183/ https://www.ncbi.nlm.nih.gov/pubmed/35163663 http://dx.doi.org/10.3390/ijms23031741 |