
Recognition of Protein Pupylation Sites by Adopting Resampling Approach

With the in-depth study of posttranslational modification sites, protein ubiquitination has become the key problem to study the molecular mechanism of posttranslational modification. Pupylation is a widely used process in which a prokaryotic ubiquitin-like protein (Pup) is attached to a substrate th...


Bibliographic Details
Main Authors: Li, Tao, Chen, Yan, Li, Taoying, Jia, Cangzhi
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6321382/
https://www.ncbi.nlm.nih.gov/pubmed/30486421
http://dx.doi.org/10.3390/molecules23123097
_version_ 1783385429136900096
author Li, Tao
Chen, Yan
Li, Taoying
Jia, Cangzhi
author_facet Li, Tao
Chen, Yan
Li, Taoying
Jia, Cangzhi
author_sort Li, Tao
collection PubMed
description With the in-depth study of posttranslational modification sites, protein ubiquitination has become a key problem in studying the molecular mechanisms of posttranslational modification. Pupylation is a widespread process in which a prokaryotic ubiquitin-like protein (Pup) is attached to a substrate through a series of biochemical reactions. However, experimental methods for identifying pupylation sites are often time-consuming and laborious. This study proposes an improved approach for predicting pupylation sites. Firstly, the Pearson correlation coefficient was used to reflect the correlation among different amino acid pairs, calculated from the frequency of each amino acid. Secondly, the multiple types of features were ranked in descending order and filtered separately by their Pearson correlation coefficient values. Thirdly, to obtain a balanced dataset, the K-means principal component analysis (KPCA) oversampling technique was employed to synthesize new positive samples, and a fuzzy undersampling method was employed to reduce the number of negative samples. Finally, the performance of the method was verified by jackknife and 10-fold cross-validation tests. The average results of 10-fold cross-validation showed that the sensitivity (Sn) was 90.53%, specificity (Sp) was 99.8%, accuracy (Acc) was 95.09%, and Matthews correlation coefficient (MCC) was 0.91. Moreover, an independent test dataset was used to further measure its performance; the predictions achieved an Acc of 83.75% and an MCC of 0.49, which were superior to those of previous predictors. The improved performance and stability of the proposed method show that it is an effective way to predict pupylation sites.
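The four measures reported in the description (Sn, Sp, Acc, MCC) follow the standard confusion-matrix definitions. The sketch below is only an illustration of those definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and do not come from the article.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute Sn, Sp, Acc, and MCC from confusion-matrix counts.

    tp/fn: positive (pupylated) sites predicted correctly/incorrectly.
    tn/fp: negative sites predicted correctly/incorrectly.
    Ratios with a zero denominator are returned as 0.0.
    """
    total = tp + fn + tn + fp
    sn = tp / (tp + fn) if (tp + fn) else 0.0    # sensitivity (recall on positives)
    sp = tn / (tn + fp) if (tn + fp) else 0.0    # specificity (recall on negatives)
    acc = (tp + tn) / total if total else 0.0    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts, chosen only to exercise the formulas.
example = binary_metrics(tp=90, fn=10, tn=95, fp=5)
```

Because MCC uses all four cells of the confusion matrix, it is less flattering than Acc on imbalanced data, which is consistent with the independent test reporting an Acc of 83.75% alongside an MCC of only 0.49.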
format Online
Article
Text
id pubmed-6321382
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6321382 2019-01-14 Recognition of Protein Pupylation Sites by Adopting Resampling Approach Li, Tao Chen, Yan Li, Taoying Jia, Cangzhi Molecules Article
MDPI 2018-11-27 /pmc/articles/PMC6321382/ /pubmed/30486421 http://dx.doi.org/10.3390/molecules23123097 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Li, Tao
Chen, Yan
Li, Taoying
Jia, Cangzhi
Recognition of Protein Pupylation Sites by Adopting Resampling Approach
title Recognition of Protein Pupylation Sites by Adopting Resampling Approach
title_sort recognition of protein pupylation sites by adopting resampling approach
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6321382/
https://www.ncbi.nlm.nih.gov/pubmed/30486421
http://dx.doi.org/10.3390/molecules23123097
work_keys_str_mv AT litao recognitionofproteinpupylationsitesbyadoptingresamplingapproach
AT chenyan recognitionofproteinpupylationsitesbyadoptingresamplingapproach
AT litaoying recognitionofproteinpupylationsitesbyadoptingresamplingapproach
AT jiacangzhi recognitionofproteinpupylationsitesbyadoptingresamplingapproach