
DNN-m6A: A Cross-Species Method for Identifying RNA N6-methyladenosine Sites Based on Deep Neural Network with Multi-Information Fusion

As a prevalent post-transcriptional modification of RNA, N6-methyladenosine (m6A) plays a crucial role in various biological processes. To better reveal its regulatory mechanism and provide new insights for drug design, the accurate genome-wide identification of m6A sites is vi...


Bibliographic Details
Main Authors: Zhang, Lu; Qin, Xinyi; Liu, Min; Xu, Ziwei; Liu, Guangzhong
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7997228/
https://www.ncbi.nlm.nih.gov/pubmed/33670877
http://dx.doi.org/10.3390/genes12030354
_version_ 1783670280562933760
author Zhang, Lu
Qin, Xinyi
Liu, Min
Xu, Ziwei
Liu, Guangzhong
author_sort Zhang, Lu
collection PubMed
description As a prevalent post-transcriptional modification of RNA, N6-methyladenosine (m6A) plays a crucial role in various biological processes. To better reveal its regulatory mechanism and provide new insights for drug design, the accurate genome-wide identification of m6A sites is vital. Because traditional experimental methods are time-consuming and cost-prohibitive, a more efficient computational method for detecting m6A sites is needed. In this study, we propose DNN-m6A, a novel cross-species computational method based on a deep neural network (DNN), to identify m6A sites in multiple tissues of human, mouse and rat. First, binary encoding (BE), tri-nucleotide composition (TNC), enhanced nucleic acid composition (ENAC), K-spaced nucleotide pair frequencies (KSNPFs), nucleotide chemical property (NCP), pseudo dinucleotide composition (PseDNC), position-specific nucleotide propensity (PSNP) and position-specific dinucleotide propensity (PSDP) are employed to extract RNA sequence features, which are then fused to construct the initial feature vector set. Second, we use elastic net to eliminate redundant features and build the optimal feature subset. Finally, the hyper-parameters of the DNN are tuned with Bayesian hyper-parameter optimization based on the selected feature subset. Five-fold cross-validation on the training datasets shows that the proposed DNN-m6A method outperforms the state-of-the-art method for predicting m6A sites, with an accuracy (ACC) of 73.58–83.38% and an area under the curve (AUC) of 81.39–91.04%. Furthermore, on the independent datasets the method achieved an ACC of 72.95–83.04% and an AUC of 80.79–91.09%, which shows the excellent generalization ability of the proposed method.
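The feature-fusion step named in the description can be illustrated with a minimal sketch. It covers only three of the eight encodings (BE, NCP and TNC) and fuses them by simple concatenation. The one-hot column order, the NCP triplets (a convention widely used in the m6A literature) and all function names are assumptions made for illustration; the record does not give the authors' implementation.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Binary encoding (BE): one-hot vector per nucleotide.
# The column order A, C, G, U is an assumption.
BINARY = {n: [1 if i == j else 0 for j in range(4)]
          for i, n in enumerate(NUCLEOTIDES)}

# Nucleotide chemical property (NCP): a commonly used convention of
# (ring structure, functional group, hydrogen bonding); assumed here.
NCP = {"A": [1, 1, 1], "C": [0, 1, 0], "G": [1, 0, 0], "U": [0, 0, 1]}

# All 64 tri-nucleotides in a fixed order, for TNC.
TRIMERS = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]


def binary_encoding(seq):
    """Concatenate the one-hot code of every position."""
    return [bit for n in seq for bit in BINARY[n]]


def ncp_encoding(seq):
    """Concatenate the chemical-property triplet of every position."""
    return [value for n in seq for value in NCP[n]]


def tnc_encoding(seq):
    """Relative frequency of each of the 64 overlapping tri-nucleotides."""
    total = len(seq) - 2
    counts = {t: 0 for t in TRIMERS}
    for i in range(total):
        counts[seq[i:i + 3]] += 1
    return [counts[t] / total for t in TRIMERS]


def fuse_features(seq):
    """Fuse the three encodings by concatenation (hypothetical helper)."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    return binary_encoding(seq) + ncp_encoding(seq) + tnc_encoding(seq)


vector = fuse_features("GGACU")
print(len(vector))  # 5*4 + 5*3 + 64 = 99
```

In a pipeline of the kind the description outlines, vectors like this would be the input to the elastic net feature selection and then to the DNN; those stages are not sketched here.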
format Online
Article
Text
id pubmed-7997228
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7997228 2021-03-27 DNN-m6A: A Cross-Species Method for Identifying RNA N6-methyladenosine Sites Based on Deep Neural Network with Multi-Information Fusion Zhang, Lu Qin, Xinyi Liu, Min Xu, Ziwei Liu, Guangzhong Genes (Basel) Article MDPI 2021-02-28 /pmc/articles/PMC7997228/ /pubmed/33670877 http://dx.doi.org/10.3390/genes12030354 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title DNN-m6A: A Cross-Species Method for Identifying RNA N6-methyladenosine Sites Based on Deep Neural Network with Multi-Information Fusion
topic Article