
An Interpretable Prediction Model for Identifying N(7)-Methylguanosine Sites Based on XGBoost and SHAP

Recent studies have increasingly shown that the chemical modification of mRNA plays an important role in the regulation of gene expression. N(7)-methylguanosine (m7G) is a positively charged mRNA modification that plays an essential role in efficient gene expression and cell viability. Howe...


Bibliographic Details
Main authors: Bi, Yue; Xiang, Dongxu; Ge, Zongyuan; Li, Fuyi; Jia, Cangzhi; Song, Jiangning
Format: Online Article Text
Language: English
Published: American Society of Gene & Cell Therapy 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7533297/
https://www.ncbi.nlm.nih.gov/pubmed/33230441
http://dx.doi.org/10.1016/j.omtn.2020.08.022
author Bi, Yue
Xiang, Dongxu
Ge, Zongyuan
Li, Fuyi
Jia, Cangzhi
Song, Jiangning
collection PubMed
description Recent studies have increasingly shown that the chemical modification of mRNA plays an important role in regulating gene expression. N(7)-methylguanosine (m7G) is a positively charged mRNA modification that is essential for efficient gene expression and cell viability. However, m7G has received little research attention to date. Bioinformatics tools can serve as auxiliary methods for identifying m7G sites in transcriptomes. In this study, we develop a novel interpretable machine learning-based approach, termed XG-m7G, for identifying m7G sites using the XGBoost algorithm and six different types of sequence-encoding schemes. Both 10-fold and jackknife cross-validation tests indicate that XG-m7G outperforms iRNA-m7G. Moreover, using the SHAP algorithm, the framework also provides interpretations of the model performance and highlights the most important features for identifying m7G sites. XG-m7G is anticipated to serve as a useful tool and guide for researchers in their future studies of mRNA modification sites.
format Online
Article
Text
id pubmed-7533297
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Society of Gene & Cell Therapy
record_format MEDLINE/PubMed
spelling pubmed-7533297 2020-10-16
Mol Ther Nucleic Acids
Original Article
American Society of Gene & Cell Therapy 2020-08-25
Text en
© 2020 The Author(s). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title An Interpretable Prediction Model for Identifying N(7)-Methylguanosine Sites Based on XGBoost and SHAP
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7533297/
https://www.ncbi.nlm.nih.gov/pubmed/33230441
http://dx.doi.org/10.1016/j.omtn.2020.08.022