ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites

Lysine SUMOylation plays an essential role in various biological functions. Several approaches integrating various algorithms have been developed for predicting SUMOylation sites based on a limited dataset. Recently, the number of identified SUMOylation sites has significantly increased due to inves...

Bibliographic Details
Main authors: Zhu, Yafei; Liu, Yuhai; Chen, Yu; Li, Lei
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9454673/
https://www.ncbi.nlm.nih.gov/pubmed/36078053
http://dx.doi.org/10.3390/cells11172646
author Zhu, Yafei
Liu, Yuhai
Chen, Yu
Li, Lei
collection PubMed
description Lysine SUMOylation plays an essential role in various biological functions. Several approaches integrating various algorithms have been developed for predicting SUMOylation sites based on a limited dataset. Recently, the number of identified SUMOylation sites has increased significantly due to investigation at the proteomics scale. We collected modification data and found that the reported approaches performed poorly on our collected data. Therefore, it is essential to explore the characteristics of this modification and construct prediction models with improved performance based on an enlarged dataset. In this study, we constructed and compared 16 classifiers by integrating four different algorithms and four encoding features selected from 11 sequence-based or physicochemical features. We found that the convolutional neural network (CNN) model integrated with residual structure, dubbed ResSUMO, performed favorably when compared with the traditional machine learning and CNN models in both cross-validation and independent tests. The area under the receiver operating characteristic (ROC) curve for ResSUMO was around 0.80, superior to that of the reported predictors. We also found that increasing the depth of neural networks in the CNN models did not improve prediction performance due to the degradation problem, but the residual structure could be included to optimize the neural networks and improve performance. This indicates that residual neural networks have the potential to be broadly applied in the prediction of other types of modification sites with great effectiveness and robustness. Furthermore, the online ResSUMO service is freely accessible.
format Online
Article
Text
id pubmed-9454673
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9454673 2022-09-09 Cells Article
MDPI 2022-08-25 /pmc/articles/PMC9454673/ /pubmed/36078053 http://dx.doi.org/10.3390/cells11172646 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9454673/
https://www.ncbi.nlm.nih.gov/pubmed/36078053
http://dx.doi.org/10.3390/cells11172646
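
The residual structure mentioned in the description can be illustrated with a minimal sketch. This is not the authors' ResSUMO implementation; the record gives no architectural details, so the function names (`conv1d_same`, `relu`, `residual_block`), the two-convolution layout, and the kernel sizes are all assumptions chosen for illustration. The sketch only shows the general idea of a skip connection: the block outputs the activation of F(x) + x, where F is a small stack of convolutions.

```python
from typing import List


def conv1d_same(x: List[float], kernel: List[float]) -> List[float]:
    """1D cross-correlation with zero padding; output length equals input length.

    Assumes an odd-length kernel (illustrative simplification).
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def relu(x: List[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def residual_block(x: List[float], k1: List[float], k2: List[float]) -> List[float]:
    """Hypothetical residual block: relu(F(x) + x), with F = conv -> relu -> conv."""
    hidden = relu(conv1d_same(x, k1))
    f_x = conv1d_same(hidden, k2)
    # Skip connection: add the unchanged input back before the final activation.
    return relu([f + xi for f, xi in zip(f_x, x)])


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0]
    zero = [0.0, 0.0, 0.0]
    # With all-zero kernels, F(x) is zero and the block passes the input through.
    print(residual_block(signal, zero, zero))
```

With all-zero kernels the block returns a non-negative input unchanged, which is the usual intuition for why adding such blocks need not make a deeper network worse than a shallower one.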