Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features
Arginine methylation is one of the most essential protein post-translational modifications. Identifying the site of arginine methylation is a critical problem in biology research. Unfortunately, biological experiments such as mass spectrometry are expensive and time-consuming. Henc...
Main Authors: | Hou, Ruiyan; Wu, Jin; Xu, Lei; Zou, Quan; Wu, Yi-Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2020 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7594152/ https://www.ncbi.nlm.nih.gov/pubmed/33134710 http://dx.doi.org/10.1021/acsomega.0c03972 |
_version_ | 1783601569380433920 |
---|---|
author | Hou, Ruiyan Wu, Jin Xu, Lei Zou, Quan Wu, Yi-Jun |
author_facet | Hou, Ruiyan Wu, Jin Xu, Lei Zou, Quan Wu, Yi-Jun |
author_sort | Hou, Ruiyan |
collection | PubMed |
description | Arginine methylation is one of the most essential protein post-translational modifications. Identifying the site of arginine methylation is a critical problem in biology research. Unfortunately, biological experiments such as mass spectrometry are expensive and time-consuming. Hence, predicting arginine methylation by machine learning is a fast and efficient alternative. In this paper, we focus on the systematic characterization of arginine methylation with composition–transition–distribution (CTD) features. The presented framework consists of three stages. In the first stage, we extract CTD features from 1750 samples and use a decision tree to generate accurate predictions. The accuracy of prediction can reach 96%. In the second stage, a support vector machine predicts the number of arginine methylation sites with an R-squared of 0.36. In the third stage, experiments carried out with the updated arginine methylation site data set show that utilizing CTD features and adopting random forest as the classifier outperforms previous methods. The accuracy of identification can reach 82.1% and 82.5% in the single methylarginine and double methylarginine data sets, respectively. The discovery presented in this paper can be helpful for future research on arginine methylation. |
format | Online Article Text |
id | pubmed-7594152 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-7594152 2020-10-30 Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features Hou, Ruiyan Wu, Jin Xu, Lei Zou, Quan Wu, Yi-Jun ACS Omega Arginine methylation is one of the most essential protein post-translational modifications. Identifying the site of arginine methylation is a critical problem in biology research. Unfortunately, biological experiments such as mass spectrometry are expensive and time-consuming. Hence, predicting arginine methylation by machine learning is a fast and efficient alternative. In this paper, we focus on the systematic characterization of arginine methylation with composition–transition–distribution (CTD) features. The presented framework consists of three stages. In the first stage, we extract CTD features from 1750 samples and use a decision tree to generate accurate predictions. The accuracy of prediction can reach 96%. In the second stage, a support vector machine predicts the number of arginine methylation sites with an R-squared of 0.36. In the third stage, experiments carried out with the updated arginine methylation site data set show that utilizing CTD features and adopting random forest as the classifier outperforms previous methods. The accuracy of identification can reach 82.1% and 82.5% in the single methylarginine and double methylarginine data sets, respectively. The discovery presented in this paper can be helpful for future research on arginine methylation. American Chemical Society 2020-10-19 /pmc/articles/PMC7594152/ /pubmed/33134710 http://dx.doi.org/10.1021/acsomega.0c03972 Text en © 2020 American Chemical Society This is an open access article published under an ACS AuthorChoice License (http://pubs.acs.org/page/policy/authorchoice_termsofuse.html), which permits copying and redistribution of the article or any adaptations for non-commercial purposes. |
spellingShingle | Hou, Ruiyan Wu, Jin Xu, Lei Zou, Quan Wu, Yi-Jun Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features |
title | Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features |
title_full | Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features |
title_fullStr | Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features |
title_full_unstemmed | Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features |
title_short | Computational Prediction of Protein Arginine Methylation Based on Composition–Transition–Distribution Features |
title_sort | computational prediction of protein arginine methylation based on composition–transition–distribution features |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7594152/ https://www.ncbi.nlm.nih.gov/pubmed/33134710 http://dx.doi.org/10.1021/acsomega.0c03972 |
work_keys_str_mv | AT houruiyan computationalpredictionofproteinargininemethylationbasedoncompositiontransitiondistributionfeatures AT wujin computationalpredictionofproteinargininemethylationbasedoncompositiontransitiondistributionfeatures AT xulei computationalpredictionofproteinargininemethylationbasedoncompositiontransitiondistributionfeatures AT zouquan computationalpredictionofproteinargininemethylationbasedoncompositiontransitiondistributionfeatures AT wuyijun computationalpredictionofproteinargininemethylationbasedoncompositiontransitiondistributionfeatures |
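
The description above refers to composition–transition–distribution (CTD) features without showing how they are computed. The following is a minimal sketch of one common CTD formulation, not the authors' implementation: the three-class hydrophobicity grouping, the function name `ctd_features`, and the feature key names are all assumptions made for illustration, and the paper may use different or additional physicochemical properties.

```python
from itertools import combinations

# Assumed three-class grouping by hydrophobicity (hypothetical choice for
# illustration; the record does not state which properties the paper uses).
GROUPS = {"1": "RKEDQN", "2": "GASTPHY", "3": "CLVIMFW"}
CLASS_OF = {aa: g for g, members in GROUPS.items() for aa in members}


def ctd_features(sequence):
    """Return a dict of 21 CTD descriptors for one protein sequence."""
    # Encode each residue as its class label; unknown residues are skipped.
    encoded = [CLASS_OF[aa] for aa in sequence.upper() if aa in CLASS_OF]
    n = len(encoded)
    if n < 2:
        raise ValueError("need at least two recognised residues")

    features = {}

    # Composition: fraction of residues falling in each class.
    for g in GROUPS:
        features[f"C{g}"] = encoded.count(g) / n

    # Transition: fraction of adjacent pairs that switch between two classes.
    for a, b in combinations(GROUPS, 2):
        switches = sum(
            1 for x, y in zip(encoded, encoded[1:]) if {x, y} == {a, b}
        )
        features[f"T{a}{b}"] = switches / (n - 1)

    # Distribution: relative position of the first, 25%, 50%, 75% and last
    # occurrence of each class along the sequence.
    for g in GROUPS:
        positions = [i + 1 for i, c in enumerate(encoded) if c == g]
        for pct in (0, 25, 50, 75, 100):
            if positions:
                k = max(1, len(positions) * pct // 100)
                value = positions[k - 1] / n
            else:
                value = 0.0
            features[f"D{g}_{pct}"] = value

    return features


features = ctd_features("RGCR")
```

A feature vector of this kind, one per sequence, is what a classifier such as a decision tree or random forest would take as input in the stages the description mentions.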