NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences
Noncoding RNA (ncRNA) is a kind of RNA that plays an important role in many biological processes, diseases, and cancers but cannot be translated into proteins. With the development of next-generation sequencing technology, thousands of novel RNAs with long open reading frames (ORFs, longest ORF length...
Main Authors: | Yang, Sen; Wang, Yan; Zhang, Shuangquan; Hu, Xuemei; Ma, Qin; Tian, Yuan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7059790/ https://www.ncbi.nlm.nih.gov/pubmed/32180792 http://dx.doi.org/10.3389/fgene.2020.00090 |
_version_ | 1783504120071585792 |
---|---|
author | Yang, Sen; Wang, Yan; Zhang, Shuangquan; Hu, Xuemei; Ma, Qin; Tian, Yuan |
author_sort | Yang, Sen |
collection | PubMed |
description | Noncoding RNA (ncRNA) is a kind of RNA that plays an important role in many biological processes, diseases, and cancers but cannot be translated into proteins. With the development of next-generation sequencing technology, thousands of novel RNAs with long open reading frames (ORFs, longest ORF length > 303 nt) and short ORFs (longest ORF length ≤ 303 nt) have been discovered in a short time. Identifying ncRNAs more precisely among novel unannotated RNAs is an important step for RNA functional analysis, RNA regulation, etc. However, most previous methods use only sequence features. Moreover, most of them focus on long-ORF RNA sequences and are not adapted to short-ORF RNA sequences. In this paper, we propose a new, reliable method called NCResNet. NCResNet takes 57 hybrid features from four categories as inputs (sequence, protein, RNA structure, and RNA physicochemical properties) and introduces feature enhancement and deep feature learning policies in a neural network model to address this problem. Experiments on benchmark datasets of 8 species show that NCResNet has higher accuracy and a higher Matthews correlation coefficient (MCC) than other state-of-the-art methods. In particular, on four short-ORF RNA sequence datasets (mouse, Saccharomyces cerevisiae, zebrafish, and cow), NCResNet achieves improvements of more than 10% and 15% over other state-of-the-art methods in accuracy and MCC, respectively. For long-ORF RNA sequence datasets, NCResNet also has better accuracy and MCC than other state-of-the-art methods on most test datasets. Code and data are available at https://github.com/abcair/NCResNet. |
format | Online Article Text |
id | pubmed-7059790 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7059790 2020-03-16 Front Genet Genetics Frontiers Media S.A. 2020-02-28 /pmc/articles/PMC7059790/ /pubmed/32180792 http://dx.doi.org/10.3389/fgene.2020.00090 Text en Copyright © 2020 Yang, Wang, Zhang, Hu, Ma and Tian http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7059790/ https://www.ncbi.nlm.nih.gov/pubmed/32180792 http://dx.doi.org/10.3389/fgene.2020.00090 |
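
The description above separates long-ORF from short-ORF RNAs at a longest-ORF length of 303 nt and reports results as accuracy and MCC. The sketch below illustrates both ideas. It is not taken from the NCResNet code: the ORF definition used here (an ATG start to the first in-frame stop codon, forward strand only, stop codon counted in the length) and all function names are assumptions made for illustration, and the article may define the longest ORF differently. The MCC formula is the standard one for a binary confusion matrix.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Longest ATG-to-stop ORF length in nt over the three forward frames.

    Assumption: the stop codon is included in the length and the
    reverse strand is not scanned.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabet
    best = 0
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def orf_class(seq: str, threshold: int = 303) -> str:
    """Label a sequence using the 303 nt cutoff quoted in the description."""
    return "long-ORF" if longest_orf_length(seq) > threshold else "short-ORF"


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    short_seq = "ATG" + "AAA" * 99 + "TAA"    # longest ORF is 303 nt
    long_seq = "ATG" + "AAA" * 100 + "TAA"    # longest ORF is 306 nt
    print(orf_class(short_seq), orf_class(long_seq))
    print(accuracy(45, 40, 10, 5), mcc(45, 40, 10, 5))
```

Under these assumptions, a sequence whose longest ORF is exactly 303 nt falls in the short-ORF group, matching the "≤ 303 nt" wording in the description.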