m5Cpred-XS: A New Method for Predicting RNA m5C Sites Based on XGBoost and SHAP

As one of the most important post-transcriptional modifications of RNA, 5-cytosine-methylation (m5C) is reported to closely relate to many chemical reactions and biological functions in cells. Recently, several computational methods have been proposed for identifying m5C sites. However, the accuracy...

Bibliographic Details
Main authors: Liu, Yinbo, Shen, Yingying, Wang, Hong, Zhang, Yong, Zhu, Xiaolei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005994/
https://www.ncbi.nlm.nih.gov/pubmed/35432446
http://dx.doi.org/10.3389/fgene.2022.853258
author Liu, Yinbo
Shen, Yingying
Wang, Hong
Zhang, Yong
Zhu, Xiaolei
collection PubMed
description As one of the most important post-transcriptional modifications of RNA, 5-cytosine-methylation (m5C) is reported to closely relate to many chemical reactions and biological functions in cells. Recently, several computational methods have been proposed for identifying m5C sites. However, the accuracy and efficiency are still not satisfactory. In this study, we proposed a new method, m5Cpred-XS, for predicting m5C sites of H. sapiens, M. musculus, and A. thaliana. First, the powerful SHAP method was used to select the optimal feature subset from seven different kinds of sequence-based features. Second, different machine learning algorithms were used to train the models. The results of five-fold cross-validation indicate that the model based on XGBoost achieved the highest prediction accuracy. Finally, our model was compared with other state-of-the-art models, which indicates that m5Cpred-XS is superior to other methods. Moreover, we deployed the model on a web server that can be accessed through http://m5cpred-xs.zhulab.org.cn/, and m5Cpred-XS is expected to be a useful tool for studying m5C sites.
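The description above outlines a pipeline of sequence-based features, feature selection, and five-fold cross-validation. The following is a minimal sketch of two of those steps only, under stated assumptions: the record does not name the seven feature types, so k-mer composition is used here purely as an illustrative sequence-based feature, and the helper names `kmer_composition` and `stratified_kfold` are hypothetical. The SHAP selection and XGBoost training steps are not reproduced.

```python
from itertools import product
import random


def kmer_composition(seq, k=2):
    """Return the frequency of every RNA k-mer in seq, in fixed alphabetical order.

    Illustrative feature only; not necessarily one of the seven used in the paper.
    """
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing non-ACGU characters
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def stratified_kfold(labels, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over the folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_splits].append(i)  # deal samples round-robin
    for f in range(n_splits):
        test = sorted(folds[f])
        train = sorted(i for g in range(n_splits) if g != f for i in folds[g])
        yield train, test
```

In a full workflow, each fold's training indices would feed a classifier and the held-out indices would be used for scoring; any feature ranking would need to be computed inside the training folds to avoid leakage.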
format Online
Article
Text
id pubmed-9005994
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9005994 2022-04-14 m5Cpred-XS: A New Method for Predicting RNA m5C Sites Based on XGBoost and SHAP Liu, Yinbo Shen, Yingying Wang, Hong Zhang, Yong Zhu, Xiaolei Front Genet Genetics Frontiers Media S.A. 2022-03-30 /pmc/articles/PMC9005994/ /pubmed/35432446 http://dx.doi.org/10.3389/fgene.2022.853258 Text en Copyright © 2022 Liu, Shen, Wang, Zhang and Zhu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title m5Cpred-XS: A New Method for Predicting RNA m5C Sites Based on XGBoost and SHAP
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005994/
https://www.ncbi.nlm.nih.gov/pubmed/35432446
http://dx.doi.org/10.3389/fgene.2022.853258