m5CPred-SVM: a novel method for predicting m5C sites of RNA

BACKGROUND: As one of the most common post-transcriptional modifications (PTCM) in RNA, 5-cytosine-methylation plays important roles in many biological functions such as RNA metabolism and cell fate decision. Through accurate identification of 5-methylcytosine (m5C) sites on RNA, researchers can bet...

Bibliographic Details
Main authors: Chen, Xiao; Xiong, Yi; Liu, Yinbo; Chen, Yuqing; Bi, Shoudong; Zhu, Xiaolei
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7602301/
https://www.ncbi.nlm.nih.gov/pubmed/33126851
http://dx.doi.org/10.1186/s12859-020-03828-4
author Chen, Xiao
Xiong, Yi
Liu, Yinbo
Chen, Yuqing
Bi, Shoudong
Zhu, Xiaolei
collection PubMed
description BACKGROUND: As one of the most common post-transcriptional modifications (PTCM) in RNA, 5-cytosine-methylation plays important roles in many biological functions such as RNA metabolism and cell fate decision. Through accurate identification of 5-methylcytosine (m5C) sites on RNA, researchers can better understand the exact role of 5-cytosine-methylation in these biological functions. In recent years, computational methods for predicting m5C sites have attracted considerable interest because of their efficiency and low cost. However, neither the accuracy nor the efficiency of these methods is yet satisfactory, and both need further improvement. RESULTS: In this work, we have developed a new computational method, m5CPred-SVM, to identify m5C sites in three species: H. sapiens, M. musculus and A. thaliana. To build this model, we first collected benchmark datasets following three recently published methods. Then, six types of sequence-based features were generated from RNA segments, and a sequential forward feature selection strategy was used to obtain the optimal feature subset. After that, the performance of models based on different learning algorithms was compared, and the model based on the support vector machine provided the highest prediction accuracy. Finally, our proposed method, m5CPred-SVM, was compared with several existing methods, and the results showed that m5CPred-SVM offered substantially higher prediction accuracy than previously published methods. It is expected that our method, m5CPred-SVM, can become a useful tool for accurate identification of m5C sites. CONCLUSION: In this study, by introducing position-specific propensity-related features, we built a new model, m5CPred-SVM, to predict RNA m5C sites in three different species. The results show that our model outperformed the existing state-of-the-art models. Our model is available to users through a web server at https://zhulab.ahu.edu.cn/m5CPred-SVM.
format Online
Article
Text
id pubmed-7602301
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7602301 2020-11-02 m5CPred-SVM: a novel method for predicting m5C sites of RNA Chen, Xiao Xiong, Yi Liu, Yinbo Chen, Yuqing Bi, Shoudong Zhu, Xiaolei BMC Bioinformatics Methodology Article BACKGROUND: As one of the most common post-transcriptional modifications (PTCM) in RNA, 5-cytosine-methylation plays important roles in many biological functions such as RNA metabolism and cell fate decision. Through accurate identification of 5-methylcytosine (m5C) sites on RNA, researchers can better understand the exact role of 5-cytosine-methylation in these biological functions. In recent years, computational methods for predicting m5C sites have attracted considerable interest because of their efficiency and low cost. However, neither the accuracy nor the efficiency of these methods is yet satisfactory, and both need further improvement. RESULTS: In this work, we have developed a new computational method, m5CPred-SVM, to identify m5C sites in three species: H. sapiens, M. musculus and A. thaliana. To build this model, we first collected benchmark datasets following three recently published methods. Then, six types of sequence-based features were generated from RNA segments, and a sequential forward feature selection strategy was used to obtain the optimal feature subset. After that, the performance of models based on different learning algorithms was compared, and the model based on the support vector machine provided the highest prediction accuracy. Finally, our proposed method, m5CPred-SVM, was compared with several existing methods, and the results showed that m5CPred-SVM offered substantially higher prediction accuracy than previously published methods. It is expected that our method, m5CPred-SVM, can become a useful tool for accurate identification of m5C sites. CONCLUSION: In this study, by introducing position-specific propensity-related features, we built a new model, m5CPred-SVM, to predict RNA m5C sites in three different species.
The results show that our model outperformed the existing state-of-the-art models. Our model is available to users through a web server at https://zhulab.ahu.edu.cn/m5CPred-SVM. BioMed Central 2020-10-30 /pmc/articles/PMC7602301/ /pubmed/33126851 http://dx.doi.org/10.1186/s12859-020-03828-4 Text en © The Author(s) 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title m5CPred-SVM: a novel method for predicting m5C sites of RNA
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7602301/
https://www.ncbi.nlm.nih.gov/pubmed/33126851
http://dx.doi.org/10.1186/s12859-020-03828-4