m5Cpred-XS: A New Method for Predicting RNA m5C Sites Based on XGBoost and SHAP
As one of the most important post-transcriptional modifications of RNA, 5-cytosine-methylation (m5C) is reported to closely relate to many chemical reactions and biological functions in cells. Recently, several computational methods have been proposed for identifying m5C sites. However, the accuracy...
Main authors: | Liu, Yinbo; Shen, Yingying; Wang, Hong; Zhang, Yong; Zhu, Xiaolei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005994/ https://www.ncbi.nlm.nih.gov/pubmed/35432446 http://dx.doi.org/10.3389/fgene.2022.853258 |
author | Liu, Yinbo Shen, Yingying Wang, Hong Zhang, Yong Zhu, Xiaolei |
---|---|
author_sort | Liu, Yinbo |
collection | PubMed |
description | As one of the most important post-transcriptional modifications of RNA, 5-cytosine-methylation (m5C) is reported to closely relate to many chemical reactions and biological functions in cells. Recently, several computational methods have been proposed for identifying m5C sites. However, the accuracy and efficiency are still not satisfactory. In this study, we proposed a new method, m5Cpred-XS, for predicting m5C sites of H. sapiens, M. musculus, and A. thaliana. First, the powerful SHAP method was used to select the optimal feature subset from seven different kinds of sequence-based features. Second, different machine learning algorithms were used to train the models. The results of five-fold cross-validation indicate that the model based on XGBoost achieved the highest prediction accuracy. Finally, our model was compared with other state-of-the-art models, which indicates that m5Cpred-XS is superior to other methods. Moreover, we deployed the model on a web server that can be accessed through http://m5cpred-xs.zhulab.org.cn/, and m5Cpred-XS is expected to be a useful tool for studying m5C sites. |
format | Online Article Text |
id | pubmed-9005994 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9005994 2022-04-14 m5Cpred-XS: A New Method for Predicting RNA m5C Sites Based on XGBoost and SHAP Liu, Yinbo Shen, Yingying Wang, Hong Zhang, Yong Zhu, Xiaolei Front Genet Genetics As one of the most important post-transcriptional modifications of RNA, 5-cytosine-methylation (m5C) is reported to closely relate to many chemical reactions and biological functions in cells. Recently, several computational methods have been proposed for identifying m5C sites. However, the accuracy and efficiency are still not satisfactory. In this study, we proposed a new method, m5Cpred-XS, for predicting m5C sites of H. sapiens, M. musculus, and A. thaliana. First, the powerful SHAP method was used to select the optimal feature subset from seven different kinds of sequence-based features. Second, different machine learning algorithms were used to train the models. The results of five-fold cross-validation indicate that the model based on XGBoost achieved the highest prediction accuracy. Finally, our model was compared with other state-of-the-art models, which indicates that m5Cpred-XS is superior to other methods. Moreover, we deployed the model on a web server that can be accessed through http://m5cpred-xs.zhulab.org.cn/, and m5Cpred-XS is expected to be a useful tool for studying m5C sites. Frontiers Media S.A. 2022-03-30 /pmc/articles/PMC9005994/ /pubmed/35432446 http://dx.doi.org/10.3389/fgene.2022.853258 Text en Copyright © 2022 Liu, Shen, Wang, Zhang and Zhu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | m5Cpred-XS: A New Method for Predicting RNA m5C Sites Based on XGBoost and SHAP |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005994/ https://www.ncbi.nlm.nih.gov/pubmed/35432446 http://dx.doi.org/10.3389/fgene.2022.853258 |