Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features

Arginine methylation is one of the most essential protein post-translational modifications. Identifying the site of arginine methylation is a critical problem in biology research. Unfortunately, biological experiments such as mass spectrometry are expensive and time-consuming. Henc...

Bibliographic Details
Main authors: Hou, Ruiyan; Wu, Jin; Xu, Lei; Zou, Quan; Wu, Yi-Jun
Format: Online Article Text
Language: English
Published: American Chemical Society 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7594152/
https://www.ncbi.nlm.nih.gov/pubmed/33134710
http://dx.doi.org/10.1021/acsomega.0c03972
author Hou, Ruiyan
Wu, Jin
Xu, Lei
Zou, Quan
Wu, Yi-Jun
collection PubMed
description Arginine methylation is one of the most essential protein post-translational modifications. Identifying the sites of arginine methylation is a critical problem in biological research. Unfortunately, biological experiments such as mass spectrometry are expensive and time-consuming. Hence, predicting arginine methylation by machine learning is a fast and efficient alternative. In this paper, we focus on the systematic characterization of arginine methylation with composition–transition–distribution (CTD) features. The presented framework consists of three stages. In the first stage, we extract CTD features from 1750 samples and use a decision tree to generate predictions; the prediction accuracy can reach 96%. In the second stage, a support vector machine predicts the number of arginine methylation sites with an R-squared of 0.36. In the third stage, experiments carried out with the updated arginine methylation site data set show that using CTD features with a random forest classifier outperforms previous methods. The identification accuracy can reach 82.1% and 82.5% on the single methylarginine and double methylarginine data sets, respectively. The findings presented in this paper can be helpful for future research on arginine methylation.
format Online
Article
Text
id pubmed-7594152
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-7594152 2020-10-30 Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features Hou, Ruiyan Wu, Jin Xu, Lei Zou, Quan Wu, Yi-Jun ACS Omega American Chemical Society 2020-10-19 /pmc/articles/PMC7594152/ /pubmed/33134710 http://dx.doi.org/10.1021/acsomega.0c03972 Text en © 2020 American Chemical Society This is an open access article published under an ACS AuthorChoice License (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html), which permits copying and redistribution of the article or any adaptations for non-commercial purposes.
title Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features
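
The composition–transition–distribution (CTD) encoding named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which physicochemical properties or class groupings the paper uses, so the three-class hydrophobicity grouping below, the function name `ctd_features`, and the handling of non-standard residues are all assumptions made for illustration.

```python
from itertools import combinations
import math

# ASSUMPTION: a three-class hydrophobicity grouping often used for CTD;
# the paper may use different or additional properties.
GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}
CLASS_OF = {aa: g for g, members in GROUPS.items() for aa in members}


def ctd_features(seq):
    """Return 21 CTD descriptors (3 C + 3 T + 15 D) for one property."""
    # Map residues to class labels; non-standard residues are skipped.
    encoded = [CLASS_OF[aa] for aa in seq.upper() if aa in CLASS_OF]
    n = len(encoded)
    if n < 2:
        raise ValueError("need at least two standard residues")

    # Composition: fraction of residues falling in each class.
    comp = [encoded.count(g) / n for g in GROUPS]

    # Transition: fraction of adjacent pairs that switch between two classes,
    # counted in either direction.
    trans = []
    for a, b in combinations(GROUPS, 2):
        switches = sum(
            1 for x, y in zip(encoded, encoded[1:]) if {x, y} == {a, b}
        )
        trans.append(switches / (n - 1))

    # Distribution: relative sequence position of the first, 25%, 50%, 75%
    # and last occurrence of each class (0.0 if the class is absent).
    dist = []
    for g in GROUPS:
        positions = [i + 1 for i, c in enumerate(encoded) if c == g]
        if not positions:
            dist.extend([0.0] * 5)
            continue
        k = len(positions)
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            idx = max(1, math.floor(frac * k))  # 1-based occurrence index
            dist.append(positions[idx - 1] / n)

    return comp + trans + dist


if __name__ == "__main__":
    features = ctd_features("RGCRGC")
    print(len(features))
```

A vector like this, computed per sequence (and, in a fuller setup, per property), is the kind of input that could be passed to a decision tree, support vector machine, or random forest as described in the abstract; the choice of library and hyperparameters is left open here.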