
Predicting N6-Methyladenosine Sites in Multiple Tissues of Mammals through Ensemble Deep Learning

N6-methyladenosine (m6A) is the most abundant modification within eukaryotic messenger RNA and plays an essential regulatory role in the control of cellular functions and gene expression. However, it remains an outstanding challenge to detect mRNA m6A transcriptome-wide at base resolution via...


Bibliographic Details
Main Authors:	Luo, Zhengtao, Lou, Liliang, Qiu, Wangren, Xu, Zhaochun, Xiao, Xuan
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9778682/ https://www.ncbi.nlm.nih.gov/pubmed/36555143 http://dx.doi.org/10.3390/ijms232415490

collection	PubMed
description	N6-methyladenosine (m6A) is the most abundant modification within eukaryotic messenger RNA and plays an essential regulatory role in the control of cellular functions and gene expression. However, detecting mRNA m6A transcriptome-wide at base resolution remains an outstanding challenge for experimental approaches, which are generally time-consuming and expensive. Computational methods offer a good strategy for accurate in silico detection of m6A modification sites from the large amount of RNA sequence data. Unfortunately, existing computational models usually predict m6A sites in a single species without considering tissue-level differences, and most are built on low-confidence data generated by m6A antibody immunoprecipitation (IP)-based sequencing, which restricts their reliability and generalizability. Here, we review recent advances in computational prediction of m6A sites and construct a new computational approach named im6APred, which uses ensemble deep learning to accurately identify m6A sites based on high-confidence data from multiple tissues of mammals. im6APred builds upon a comprehensive evaluation of multiple classification methods, including four traditional classification algorithms, three deep learning methods, and their ensembles. The optimal base-classifier combinations are then chosen by five-fold cross-validation to obtain an effective stacked model. im6APred achieves an area under the receiver operating characteristic curve (AUROC) in the range of 0.82–0.91 on independent tests, indicating that the model can learn general methylation rules on RNA bases and generalize to transcriptome-wide m6A identification. Moreover, AUROCs in the range of 0.77–0.96 were achieved using cross-species/tissues validation on the benchmark dataset, demonstrating differences in predictive performance at the tissue level and the need for constructing tissue-specific models for m6A site prediction.
id	pubmed-9778682
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-9778682 2022-12-23 Int J Mol Sci Article MDPI 2022-12-07 /pmc/articles/PMC9778682/ /pubmed/36555143 http://dx.doi.org/10.3390/ijms232415490 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


Similar Items