NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences

Noncoding RNA (ncRNA) is a class of RNA that plays an important role in many biological processes, diseases, and cancers but is not translated into protein. With the development of next-generation sequencing technology, thousands of novel RNAs with long open reading frames (ORFs, longest ORF length...

Bibliographic Details
Main authors: Yang, Sen, Wang, Yan, Zhang, Shuangquan, Hu, Xuemei, Ma, Qin, Tian, Yuan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7059790/
https://www.ncbi.nlm.nih.gov/pubmed/32180792
http://dx.doi.org/10.3389/fgene.2020.00090
_version_ 1783504120071585792
author Yang, Sen
Wang, Yan
Zhang, Shuangquan
Hu, Xuemei
Ma, Qin
Tian, Yuan
author_facet Yang, Sen
Wang, Yan
Zhang, Shuangquan
Hu, Xuemei
Ma, Qin
Tian, Yuan
author_sort Yang, Sen
collection PubMed
description Noncoding RNA (ncRNA) is a class of RNA that plays an important role in many biological processes, diseases, and cancers but is not translated into protein. With the development of next-generation sequencing technology, thousands of novel RNAs with long open reading frames (ORFs, longest ORF length > 303 nt) and short ORFs (longest ORF length ≤ 303 nt) have been discovered in a short time. Identifying ncRNAs more precisely among novel unannotated RNAs is an important step for RNA functional analysis, RNA regulation, etc. However, most previous methods use only sequence features. Moreover, most of them focus on long-ORF RNA sequences and are not adapted to short-ORF RNA sequences. In this paper, we propose a new reliable method called NCResNet. NCResNet takes 57 hybrid features from four categories as inputs, namely sequence, protein, RNA structure, and RNA physicochemical properties, and introduces feature enhancement and deep feature learning policies in a neural network model to address this problem. Experiments on benchmark datasets from eight species show that NCResNet achieves higher accuracy and a higher Matthews correlation coefficient (MCC) than other state-of-the-art methods. In particular, on four short-ORF RNA sequence datasets (mouse, Saccharomyces cerevisiae, zebrafish, and cow), NCResNet achieves improvements of more than 10% and 15% over other state-of-the-art methods in accuracy and MCC, respectively. For long-ORF RNA sequence datasets, NCResNet also has better accuracy and MCC than other state-of-the-art methods on most test datasets. Code and data are available at https://github.com/abcair/NCResNet.
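The two quantities the description relies on, the 303 nt longest-ORF threshold and the MCC, can be illustrated with a minimal Python sketch. It is not taken from the NCResNet repository. The ORF definition used here (forward strand only, ATG to the first in-frame stop codon, stop codon counted in the length) and the function names longest_orf_length, orf_class, and mcc are assumptions made for illustration; the paper's own ORF definition may differ.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}
ORF_THRESHOLD_NT = 303  # long/short boundary quoted in the description


def longest_orf_length(seq: str) -> int:
    """Longest ORF length in nt over the three forward reading frames.

    Assumption: an ORF runs from ATG to the first in-frame stop codon,
    and the stop codon is included in the length.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def orf_class(seq: str) -> str:
    """'long' if the longest ORF exceeds 303 nt, otherwise 'short'."""
    return "long" if longest_orf_length(seq) > ORF_THRESHOLD_NT else "short"


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Under these assumptions, a sequence whose longest ORF is exactly 303 nt falls in the short-ORF group, matching the "≤ 303 nt" wording above.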
format Online
Article
Text
id pubmed-7059790
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7059790 2020-03-16 NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences Yang, Sen Wang, Yan Zhang, Shuangquan Hu, Xuemei Ma, Qin Tian, Yuan Front Genet Genetics Noncoding RNA (ncRNA) is a class of RNA that plays an important role in many biological processes, diseases, and cancers but is not translated into protein. With the development of next-generation sequencing technology, thousands of novel RNAs with long open reading frames (ORFs, longest ORF length > 303 nt) and short ORFs (longest ORF length ≤ 303 nt) have been discovered in a short time. Identifying ncRNAs more precisely among novel unannotated RNAs is an important step for RNA functional analysis, RNA regulation, etc. However, most previous methods use only sequence features. Moreover, most of them focus on long-ORF RNA sequences and are not adapted to short-ORF RNA sequences. In this paper, we propose a new reliable method called NCResNet. NCResNet takes 57 hybrid features from four categories as inputs, namely sequence, protein, RNA structure, and RNA physicochemical properties, and introduces feature enhancement and deep feature learning policies in a neural network model to address this problem. Experiments on benchmark datasets from eight species show that NCResNet achieves higher accuracy and a higher Matthews correlation coefficient (MCC) than other state-of-the-art methods. In particular, on four short-ORF RNA sequence datasets (mouse, Saccharomyces cerevisiae, zebrafish, and cow), NCResNet achieves improvements of more than 10% and 15% over other state-of-the-art methods in accuracy and MCC, respectively. For long-ORF RNA sequence datasets, NCResNet also has better accuracy and MCC than other state-of-the-art methods on most test datasets. Code and data are available at https://github.com/abcair/NCResNet. Frontiers Media S.A. 
2020-02-28 /pmc/articles/PMC7059790/ /pubmed/32180792 http://dx.doi.org/10.3389/fgene.2020.00090 Text en Copyright © 2020 Yang, Wang, Zhang, Hu, Ma and Tian http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Yang, Sen
Wang, Yan
Zhang, Shuangquan
Hu, Xuemei
Ma, Qin
Tian, Yuan
NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences
title NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences
title_full NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences
title_fullStr NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences
title_full_unstemmed NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences
title_short NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences
title_sort ncresnet: noncoding ribonucleic acid prediction based on a deep resident network of ribonucleic acid sequences
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7059790/
https://www.ncbi.nlm.nih.gov/pubmed/32180792
http://dx.doi.org/10.3389/fgene.2020.00090