Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method
Succinylation is a type of protein post-translational modification (PTM), which can play important roles in a variety of cellular processes. Due to an increasing number of site-specific succinylated peptides obtained from high-throughput mass spectrometry (MS), various tools have been developed for...
Main Authors: | Huang, Kai-Yao; Hsu, Justin Bo-Kai; Lee, Tzong-Yi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6838336/ https://www.ncbi.nlm.nih.gov/pubmed/31700141 http://dx.doi.org/10.1038/s41598-019-52552-4 |
_version_ | 1783467201556119552 |
---|---|
author | Huang, Kai-Yao; Hsu, Justin Bo-Kai; Lee, Tzong-Yi |
author_facet | Huang, Kai-Yao; Hsu, Justin Bo-Kai; Lee, Tzong-Yi |
author_sort | Huang, Kai-Yao |
collection | PubMed |
description | Succinylation is a type of protein post-translational modification (PTM), which can play important roles in a variety of cellular processes. Due to an increasing number of site-specific succinylated peptides obtained from high-throughput mass spectrometry (MS), various tools have been developed for computationally identifying succinylated sites on proteins. However, most of these tools predict succinylation sites based on traditional machine learning methods. Hence, this work aimed to predict succinylation sites using a deep learning model. The abundance of MS-verified succinylated peptides enabled the investigation of substrate site specificity of succinylation sites through sequence-based attributes, such as position-specific amino acid composition, the composition of k-spaced amino acid pairs (CKSAAP), and position-specific scoring matrix (PSSM). Additionally, maximal dependence decomposition (MDD) was adopted to detect the substrate signatures of lysine succinylation sites by dividing all succinylated sequences into several groups with conserved substrate motifs. According to the results of ten-fold cross-validation, the deep learning model trained using PSSM and informative CKSAAP attributes achieved the best predictive performance and outperformed traditional machine-learning methods. Moreover, an independent testing dataset with no overlap with the training dataset was used to compare the proposed method with six existing prediction tools. The testing dataset comprised 218 positive and 2621 negative instances, and the proposed model achieved 84.40% sensitivity, 86.99% specificity, 86.79% accuracy, and an MCC value of 0.489. Finally, the proposed method has been implemented as a web-based prediction tool (CNN-SuccSite), which is now freely accessible at http://csb.cse.yzu.edu.tw/CNN-SuccSite/. |
format | Online Article Text |
id | pubmed-6838336 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-6838336 2019-11-14 Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method Huang, Kai-Yao; Hsu, Justin Bo-Kai; Lee, Tzong-Yi Sci Rep Article Nature Publishing Group UK 2019-11-07 /pmc/articles/PMC6838336/ /pubmed/31700141 http://dx.doi.org/10.1038/s41598-019-52552-4 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Huang, Kai-Yao Hsu, Justin Bo-Kai Lee, Tzong-Yi Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method |
title | Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method |
title_full | Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method |
title_fullStr | Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method |
title_full_unstemmed | Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method |
title_short | Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method |
title_sort | characterization and identification of lysine succinylation sites based on deep learning method |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6838336/ https://www.ncbi.nlm.nih.gov/pubmed/31700141 http://dx.doi.org/10.1038/s41598-019-52552-4 |
work_keys_str_mv | AT huangkaiyao characterizationandidentificationoflysinesuccinylationsitesbasedondeeplearningmethod AT hsujustinbokai characterizationandidentificationoflysinesuccinylationsitesbasedondeeplearningmethod AT leetzongyi characterizationandidentificationoflysinesuccinylationsitesbasedondeeplearningmethod |