MDD-Palm: Identification of protein S-palmitoylation sites with substrate motifs based on maximal dependence decomposition

Bibliographic Details
Main Authors: Weng, Shun-Long, Kao, Hui-Ju, Huang, Chien-Hsun, Lee, Tzong-Yi
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5491019/
https://www.ncbi.nlm.nih.gov/pubmed/28662047
http://dx.doi.org/10.1371/journal.pone.0179529
author Weng, Shun-Long
Kao, Hui-Ju
Huang, Chien-Hsun
Lee, Tzong-Yi
collection PubMed
description S-palmitoylation, the covalent attachment of 16-carbon palmitic acids to a cysteine residue via a thioester linkage, is an important reversible lipid modification that plays a regulatory role in a variety of physiological and biological processes. As the number of experimentally identified S-palmitoylated peptides increases, it is imperative to investigate substrate motifs to facilitate the study of protein S-palmitoylation. Based on 710 non-homologous S-palmitoylation sites obtained from published databases and the literature, we carried out a bioinformatics investigation of S-palmitoylation sites based on amino acid composition. Two Sample Logo indicates that positively charged and polar amino acids surrounding S-palmitoylated sites may be associated with the substrate site specificity of protein S-palmitoylation. Additionally, maximal dependence decomposition (MDD) was applied to explore the motif signatures of S-palmitoylation sites by categorizing a large-scale dataset into subgroups with statistically significant conservation of amino acids. Single features such as amino acid composition (AAC), amino acid pair composition (AAPC), position specific scoring matrix (PSSM), position weight matrix (PWM), amino acid substitution matrix (BLOSUM62), and accessible surface area (ASA) were considered, along with the effectiveness of incorporating MDD-identified substrate motifs into a two-layered prediction model. Evaluation by five-fold cross-validation showed that a hybrid of AAC and PSSM performs best at discriminating between S-palmitoylation and non-S-palmitoylation sites, according to the support vector machine (SVM). The two-layered SVM model integrating MDD-identified substrate motifs performed well, with a sensitivity of 0.79, specificity of 0.80, accuracy of 0.80, and Matthews Correlation Coefficient (MCC) value of 0.45. 
Using an independent testing dataset (613 S-palmitoylated and 5412 non-S-palmitoylated sites) obtained from the literature, we demonstrated that the two-layered SVM model could outperform other prediction tools, yielding a balanced sensitivity and specificity of 0.690 and 0.694, respectively. This two-layered SVM model has been implemented as a web-based system (MDD-Palm), which is now freely available at http://csb.cse.yzu.edu.tw/MDDPalm/.
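The description above names amino acid composition (AAC) as a feature and reports sensitivity, specificity, accuracy, and MCC. The following is a minimal Python sketch of how those quantities are commonly computed; it is not the authors' implementation. The function names, the example peptide window, and the confusion-matrix counts are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, in a fixed order so every feature vector aligns.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(window):
    """Amino acid composition: fraction of each standard residue in a window.

    Non-standard characters (e.g. padding symbols) are ignored.
    """
    counts = Counter(window.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical 11-residue window centred on a cysteine (not from the paper).
    features = aac("MGCVQCKDKEA")
    print(dict(zip(AMINO_ACIDS, features)))

    # Hypothetical, imbalanced counts (10 positives, 100 negatives).
    print(binary_metrics(tp=8, fp=20, tn=80, fn=2))
```

On imbalanced data such as the hypothetical counts above, MCC can be much lower than sensitivity, specificity, and accuracy, which is one reason it is often reported alongside them.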
format Online
Article
Text
id pubmed-5491019
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5491019 2017-07-18 MDD-Palm: Identification of protein S-palmitoylation sites with substrate motifs based on maximal dependence decomposition Weng, Shun-Long Kao, Hui-Ju Huang, Chien-Hsun Lee, Tzong-Yi PLoS One Research Article Public Library of Science 2017-06-29 /pmc/articles/PMC5491019/ /pubmed/28662047 http://dx.doi.org/10.1371/journal.pone.0179529 Text en © 2017 Weng et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title MDD-Palm: Identification of protein S-palmitoylation sites with substrate motifs based on maximal dependence decomposition
topic Research Article