Cargando…
ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites
Lysine SUMOylation plays an essential role in various biological functions. Several approaches integrating various algorithms have been developed for predicting SUMOylation sites based on a limited dataset. Recently, the number of identified SUMOylation sites has significantly increased due to inves...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2022
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9454673/ https://www.ncbi.nlm.nih.gov/pubmed/36078053 http://dx.doi.org/10.3390/cells11172646 |
_version_ | 1784785404871835648 |
---|---|
author | Zhu, Yafei Liu, Yuhai Chen, Yu Li, Lei |
author_facet | Zhu, Yafei Liu, Yuhai Chen, Yu Li, Lei |
author_sort | Zhu, Yafei |
collection | PubMed |
description | Lysine SUMOylation plays an essential role in various biological functions. Several approaches integrating various algorithms have been developed for predicting SUMOylation sites based on a limited dataset. Recently, the number of identified SUMOylation sites has significantly increased due to investigation at the proteomics scale. We collected modification data and found the reported approaches had poor performance using our collected data. Therefore, it is essential to explore the characteristics of this modification and construct prediction models with improved performance based on an enlarged dataset. In this study, we constructed and compared 16 classifiers by integrating four different algorithms and four encoding features selected from 11 sequence-based or physicochemical features. We found that the convolution neural network (CNN) model integrated with residue structure, dubbed ResSUMO, performed favorably when compared with the traditional machine learning and CNN models in both cross-validation and independent tests. The area under the receiver operating characteristic (ROC) curve for ResSUMO was around 0.80, superior to that of the reported predictors. We also found that increasing the depth of neural networks in the CNN models did not improve prediction performance due to the degradation problem, but the residual structure could be included to optimize the neural networks and improve performance. This indicates that residual neural networks have the potential to be broadly applied in the prediction of other types of modification sites with great effectiveness and robustness. Furthermore, the online ResSUMO service is freely accessible. |
format | Online Article Text |
id | pubmed-9454673 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-94546732022-09-09 ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites Zhu, Yafei Liu, Yuhai Chen, Yu Li, Lei Cells Article Lysine SUMOylation plays an essential role in various biological functions. Several approaches integrating various algorithms have been developed for predicting SUMOylation sites based on a limited dataset. Recently, the number of identified SUMOylation sites has significantly increased due to investigation at the proteomics scale. We collected modification data and found the reported approaches had poor performance using our collected data. Therefore, it is essential to explore the characteristics of this modification and construct prediction models with improved performance based on an enlarged dataset. In this study, we constructed and compared 16 classifiers by integrating four different algorithms and four encoding features selected from 11 sequence-based or physicochemical features. We found that the convolution neural network (CNN) model integrated with residue structure, dubbed ResSUMO, performed favorably when compared with the traditional machine learning and CNN models in both cross-validation and independent tests. The area under the receiver operating characteristic (ROC) curve for ResSUMO was around 0.80, superior to that of the reported predictors. We also found that increasing the depth of neural networks in the CNN models did not improve prediction performance due to the degradation problem, but the residual structure could be included to optimize the neural networks and improve performance. This indicates that residual neural networks have the potential to be broadly applied in the prediction of other types of modification sites with great effectiveness and robustness. Furthermore, the online ResSUMO service is freely accessible. MDPI 2022-08-25 /pmc/articles/PMC9454673/ /pubmed/36078053 http://dx.doi.org/10.3390/cells11172646 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhu, Yafei Liu, Yuhai Chen, Yu Li, Lei ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites |
title | ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites |
title_full | ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites |
title_fullStr | ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites |
title_full_unstemmed | ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites |
title_short | ResSUMO: A Deep Learning Architecture Based on Residual Structure for Prediction of Lysine SUMOylation Sites |
title_sort | ressumo: a deep learning architecture based on residual structure for prediction of lysine sumoylation sites |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9454673/ https://www.ncbi.nlm.nih.gov/pubmed/36078053 http://dx.doi.org/10.3390/cells11172646 |
work_keys_str_mv | AT zhuyafei ressumoadeeplearningarchitecturebasedonresidualstructureforpredictionoflysinesumoylationsites AT liuyuhai ressumoadeeplearningarchitecturebasedonresidualstructureforpredictionoflysinesumoylationsites AT chenyu ressumoadeeplearningarchitecturebasedonresidualstructureforpredictionoflysinesumoylationsites AT lilei ressumoadeeplearningarchitecturebasedonresidualstructureforpredictionoflysinesumoylationsites |