MDD-carb: a combinatorial model for the identification of protein carbonylation sites with substrate motifs
BACKGROUND: Carbonylation, which occurs when reactive oxygen species (ROS) oxidize specific residues, is an irreversible oxidative modification of proteins. Carbonylation has been reported to be associated with a number of metabolic or aging-related diseases, including diabetes, chronic l...
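Main Authors: | Kao, Hui-Ju; Weng, Shun-Long; Huang, Kai-Yao; Kaunang, Fergie Joanda; Hsu, Justin Bo-Kai; Huang, Chien-Hsun; Lee, Tzong-Yi |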
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5763492/ https://www.ncbi.nlm.nih.gov/pubmed/29322938 http://dx.doi.org/10.1186/s12918-017-0511-4 |
_version_ | 1783291897582714880 |
---|---|
author | Kao, Hui-Ju; Weng, Shun-Long; Huang, Kai-Yao; Kaunang, Fergie Joanda; Hsu, Justin Bo-Kai; Huang, Chien-Hsun; Lee, Tzong-Yi |
author_sort | Kao, Hui-Ju |
collection | PubMed |
description | BACKGROUND: Carbonylation, which occurs when reactive oxygen species (ROS) oxidize specific residues, is an irreversible oxidative modification of proteins. Carbonylation has been reported to be associated with a number of metabolic or aging-related diseases, including diabetes, chronic lung disease, Parkinson’s disease, and Alzheimer’s disease. Because computational methods dedicated to exploring motif signatures of protein carbonylation sites are lacking, we applied an iterative statistical method to characterize and identify carbonylated sites with motif signatures. RESULTS: By manually curating experimental data from research articles, we obtained 332, 144, 135, and 140 verified substrate sites for K (lysine), R (arginine), T (threonine), and P (proline) residues, respectively, from 241 carbonylated proteins. To identify attributes that are informative for distinguishing carbonylated from non-carbonylated sites, we investigated several features: composition of the twenty amino acids (AAC), composition of amino acid pairs (AAPC), position-specific scoring matrix (PSSM), and positional weighted matrix (PWM). Additionally, to explore the motif signatures of carbonylation sites, an iterative statistical method was used to detect statistically significant dependencies of amino acid composition between specific positions around substrate sites. A profile hidden Markov model (HMM) was then trained as a predictive model for each motif signature. Moreover, a support vector machine (SVM) was used to construct an integrative model that combines the bit scores obtained from the profile HMMs. The combinatorial model provided enhanced performance, with balanced sensitivity and specificity, in both cross-validation and independent testing. CONCLUSION: This study provides a new scheme for exploring potential motif signatures at substrate sites of protein carbonylation. The usefulness of the revealed motifs for identifying carbonylated sites is demonstrated by their effective performance in cross-validation and independent testing. Finally, these substrate motifs were used to build an online resource (MDD-Carb, http://csb.cse.yzu.edu.tw/MDDCarb/) and are expected to facilitate the study of large-scale carbonylated proteomes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12918-017-0511-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5763492 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5763492 2018-01-17 MDD-carb: a combinatorial model for the identification of protein carbonylation sites with substrate motifs Kao, Hui-Ju; Weng, Shun-Long; Huang, Kai-Yao; Kaunang, Fergie Joanda; Hsu, Justin Bo-Kai; Huang, Chien-Hsun; Lee, Tzong-Yi BMC Syst Biol Research BioMed Central 2017-12-21 /pmc/articles/PMC5763492/ /pubmed/29322938 http://dx.doi.org/10.1186/s12918-017-0511-4 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | MDD-carb: a combinatorial model for the identification of protein carbonylation sites with substrate motifs |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5763492/ https://www.ncbi.nlm.nih.gov/pubmed/29322938 http://dx.doi.org/10.1186/s12918-017-0511-4 |
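
The description above names amino acid composition (AAC) and amino acid pair composition (AAPC) among the features investigated. The following is a minimal sketch of how such features might be computed for a sequence window centered on a candidate site. It is not the authors' implementation: the function names, the flank size, and the `-` padding character are all assumptions made for illustration, and the record does not state the window length used in the study.

```python
from collections import Counter
from itertools import product

# The twenty standard amino acids, in a fixed order so feature vectors align.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extract_window(sequence, site_index, flank=10, pad="-"):
    """Return the residues within `flank` positions of a candidate site.

    The flank size and the pad character are hypothetical choices; positions
    beyond either terminus are filled with `pad` so every window has the
    same length (2 * flank + 1).
    """
    padded = pad * flank + sequence + pad * flank
    center = site_index + flank
    return padded[center - flank:center + flank + 1]


def aac(window):
    """Amino acid composition: frequency of each of the 20 residues."""
    residues = [r for r in window if r in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues) or 1  # avoid division by zero on empty windows
    return [counts[a] / total for a in AMINO_ACIDS]


def aapc(window):
    """Amino acid pair composition: frequency of each of the 400 adjacent pairs."""
    pairs = [window[i:i + 2] for i in range(len(window) - 1)]
    # Skip pairs that include a pad character or a non-standard residue.
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid) or 1
    return [counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]
```

For example, `extract_window("MKTAYIAKQR", 1, flank=3)` yields a seven-residue window centered on the lysine at index 1, and `aac` and `aapc` turn that window into vectors of length 20 and 400 that could be passed to a classifier.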