
Prediction of Lysine Ubiquitylation with Ensemble Classifier and Feature Selection

Ubiquitylation is an important process of post-translational modification. Correct identification of protein lysine ubiquitylation sites is of fundamental importance to understand the molecular mechanism of lysine ubiquitylation in biological systems. This paper develops a novel computational method...

Bibliographic Details
Main authors: Zhao, Xiaowei; Li, Xiangtao; Ma, Zhiqiang; Yin, Minghao
Format: Online Article Text
Language: English
Published: Molecular Diversity Preservation International (MDPI) 2011
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3257073/
https://www.ncbi.nlm.nih.gov/pubmed/22272076
http://dx.doi.org/10.3390/ijms12128347
author Zhao, Xiaowei
Li, Xiangtao
Ma, Zhiqiang
Yin, Minghao
collection PubMed
description Ubiquitylation is an important process of post-translational modification. Correct identification of protein lysine ubiquitylation sites is of fundamental importance to understand the molecular mechanism of lysine ubiquitylation in biological systems. This paper develops a novel computational method to effectively identify the lysine ubiquitylation sites based on the ensemble approach. In the proposed method, 468 ubiquitylation sites from 323 proteins retrieved from the Swiss-Prot database were encoded into feature vectors by using four kinds of protein sequence information. An effective feature selection method was then applied to extract informative feature subsets. After different feature subsets were obtained by setting different starting points in the search procedure, they were used to train multiple random forest classifiers, which were then aggregated into a consensus classifier by majority voting. Evaluated by jackknife tests and independent tests respectively, the accuracy of the proposed predictor reached 76.82% for the training dataset and 79.16% for the test dataset, indicating that this predictor is a useful tool to predict lysine ubiquitylation sites. Furthermore, site-specific feature analysis was performed and it was shown that ubiquitylation is intimately correlated with the features of its surrounding sites in addition to features derived from the lysine site itself. The feature selection method is available upon request.
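The consensus step described in the abstract (several classifiers, each trained on a different feature subset, combined by majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: `make_subset_classifier` is a hypothetical stand-in that thresholds the mean of the selected features instead of training a random forest, and the feature indices, thresholds, and input vectors are invented purely for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], int]


def make_subset_classifier(feature_indices: List[int], threshold: float) -> Classifier:
    """Hypothetical stand-in for a random forest trained on one feature subset.

    It predicts 1 (ubiquitylated) when the mean of the selected features
    exceeds the threshold, and 0 otherwise.
    """
    def predict(x: Vector) -> int:
        selected = [x[i] for i in feature_indices]
        return 1 if sum(selected) / len(selected) > threshold else 0

    return predict


def majority_vote(classifiers: Sequence[Classifier], x: Vector) -> int:
    """Return the label predicted by most classifiers.

    An odd number of binary classifiers is assumed, so ties cannot occur.
    """
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Three classifiers, each looking at a different (invented) feature subset.
ensemble = [
    make_subset_classifier([0, 1], 0.5),
    make_subset_classifier([1, 2], 0.5),
    make_subset_classifier([0, 3], 0.5),
]

site_a = [0.9, 0.8, 0.1, 0.2]  # votes: 1, 0, 1
site_b = [0.1, 0.2, 0.9, 0.3]  # votes: 0, 1, 0

print(majority_vote(ensemble, site_a))  # 1
print(majority_vote(ensemble, site_b))  # 0
```

Using an odd number of voters keeps the binary vote free of ties; with an even number, a tie-breaking rule would have to be chosen explicitly.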
format Online
Article
Text
id pubmed-3257073
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Molecular Diversity Preservation International (MDPI)
record_format MEDLINE/PubMed
spelling pubmed-3257073 2012-01-23 Prediction of Lysine Ubiquitylation with Ensemble Classifier and Feature Selection Zhao, Xiaowei Li, Xiangtao Ma, Zhiqiang Yin, Minghao Int J Mol Sci Article Molecular Diversity Preservation International (MDPI) 2011-11-28 /pmc/articles/PMC3257073/ /pubmed/22272076 http://dx.doi.org/10.3390/ijms12128347 Text en © 2011 by the authors; licensee MDPI, Basel, Switzerland.
http://creativecommons.org/licenses/by/3.0 This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title Prediction of Lysine Ubiquitylation with Ensemble Classifier and Feature Selection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3257073/
https://www.ncbi.nlm.nih.gov/pubmed/22272076
http://dx.doi.org/10.3390/ijms12128347