
SuccSite: Incorporating Amino Acid Composition and Informative k-spaced Amino Acid Pairs to Identify Protein Succinylation Sites

Protein succinylation is a biochemical reaction in which a succinyl group (-CO-CH2-CH2-CO-) is attached to the lysine residue of a protein molecule. Lysine succinylation plays important regulatory roles in living cells. However, studies in this field are limited by the difficulty in experimentally i...


Bibliographic Details
Main Authors: Kao, Hui-Ju, Nguyen, Van-Nui, Huang, Kai-Yao, Chang, Wen-Chi, Lee, Tzong-Yi
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7647693/
https://www.ncbi.nlm.nih.gov/pubmed/32592791
http://dx.doi.org/10.1016/j.gpb.2018.10.010
_version_ 1783606959717482496
author Kao, Hui-Ju
Nguyen, Van-Nui
Huang, Kai-Yao
Chang, Wen-Chi
Lee, Tzong-Yi
collection PubMed
description Protein succinylation is a biochemical reaction in which a succinyl group (-CO-CH2-CH2-CO-) is attached to the lysine residue of a protein molecule. Lysine succinylation plays important regulatory roles in living cells. However, studies in this field are limited by the difficulty in experimentally identifying the substrate site specificity of lysine succinylation. To facilitate this process, several tools have been proposed for the computational identification of succinylated lysine sites. In this study, we developed an approach to investigate the substrate specificity of lysine succinylated sites based on amino acid composition. Using experimentally verified lysine succinylated sites collected from public resources, the significant differences in position-specific amino acid composition between succinylated and non-succinylated sites were represented using the Two Sample Logo program. These findings enabled the adoption of an effective machine learning method, support vector machine, to train a predictive model with not only the amino acid composition, but also the composition of k-spaced amino acid pairs. After the selection of the best model using a ten-fold cross-validation approach, the selected model significantly outperformed existing tools based on an independent dataset manually extracted from published research articles. Finally, the selected model was used to develop a web-based tool, SuccSite, to aid the study of protein succinylation. Two proteins were used as case studies on the website to demonstrate the effective prediction of succinylation sites. We will regularly update SuccSite by integrating more experimental datasets. SuccSite is freely accessible at http://csb.cse.yzu.edu.tw/SuccSite/.
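The two feature types named in the description, amino acid composition and the composition of k-spaced amino acid pairs, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `aac` and `cksaap`, the `k_max` default, the example window, and the choice to normalize by the number of valid pairs are all assumptions made for illustration. The record does not state the window length or the values of k that were used.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(window):
    """Amino acid composition: fraction of each standard residue in the window.

    Characters outside the 20 standard residues (e.g. padding) are ignored.
    """
    residues = [r for r in window if r in AMINO_ACIDS]
    n = len(residues) or 1  # avoid division by zero on an empty window
    return [residues.count(a) / n for a in AMINO_ACIDS]


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    For each k, counts pairs of residues separated by exactly k positions
    and divides by the number of valid pairs found (an assumed normalization).
    Returns 400 values per k, ordered AA, AC, ..., YY.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard characters
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


# Hypothetical sequence window centred on a lysine (K); not taken from the paper.
example_window = "MAGLKAAKEVL"
vector = aac(example_window) + cksaap(example_window, k_max=2)
print(len(vector))  # 20 AAC values + 3 * 400 pair values
```

A vector built this way for each candidate lysine could then be passed to a support vector machine and evaluated with ten-fold cross-validation, as the description outlines; which library and kernel to use is left open here because the record does not specify them.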
format Online
Article
Text
id pubmed-7647693
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7647693 2020-11-13 SuccSite: Incorporating Amino Acid Composition and Informative k-spaced Amino Acid Pairs to Identify Protein Succinylation Sites Kao, Hui-Ju Nguyen, Van-Nui Huang, Kai-Yao Chang, Wen-Chi Lee, Tzong-Yi Genomics Proteomics Bioinformatics Method Elsevier 2020-04 2020-06-24 /pmc/articles/PMC7647693/ /pubmed/32592791 http://dx.doi.org/10.1016/j.gpb.2018.10.010 Text en © 2020 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title SuccSite: Incorporating Amino Acid Composition and Informative k-spaced Amino Acid Pairs to Identify Protein Succinylation Sites
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7647693/
https://www.ncbi.nlm.nih.gov/pubmed/32592791
http://dx.doi.org/10.1016/j.gpb.2018.10.010