DMfold: A Novel Method to Predict RNA Secondary Structure With Pseudoknots Based on Deep Learning and Improved Base Pair Maximization Principle

Bibliographic Details
Main Authors: Wang, Linyu; Liu, Yuanning; Zhong, Xiaodan; Liu, Haiming; Lu, Chao; Li, Cong; Zhang, Hao
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6409321/
https://www.ncbi.nlm.nih.gov/pubmed/30886627
http://dx.doi.org/10.3389/fgene.2019.00143
_version_ 1783401939485065216
author Wang, Linyu
Liu, Yuanning
Zhong, Xiaodan
Liu, Haiming
Lu, Chao
Li, Cong
Zhang, Hao
author_sort Wang, Linyu
collection PubMed
description Predicting the secondary structure of RNA is vital for studying its function, but determining that structure is challenging, especially for RNAs with pseudoknots. Several computational methods can predict secondary structure (with or without pseudoknots), each with its own merits and drawbacks. These methods fall into two categories: multi-sequence methods and single-sequence methods. Multi-sequence methods use auxiliary sequences to assist prediction, but they succeed only when multiple highly homologous sequences are available. Single-sequence methods are easy to operate, since they need only the target sequence, but their folding parameters capture features common to diverse RNAs and cannot describe the characteristics unique to a particular RNA, which can lead to low prediction accuracy for some RNAs. In this paper, we propose “DMfold,” a method based on deep learning and an improved base pair maximization principle, to predict secondary structure with pseudoknots; it combines the advantages of both categories while avoiding some of their disadvantages. Notably, DMfold predicts the secondary structure of an RNA by learning from similar RNAs with known structures. It uses similar RNA sequences rather than the highly homologous sequences that multi-sequence methods require, thereby reducing the requirement for auxiliary sequences. DMfold needs only the target sequence as input. Its folding parameters are extracted automatically by deep learning, which avoids the shortage of folding parameters in single-sequence methods. Experiments show that our method is simple to operate and also improves prediction accuracy compared with several established prediction methods.
A repository containing our code can be found at https://github.com/linyuwangPHD/RNA-Secondary-Structure-Database.
format Online
Article
Text
id pubmed-6409321
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6409321 2019-03-18 DMfold: A Novel Method to Predict RNA Secondary Structure With Pseudoknots Based on Deep Learning and Improved Base Pair Maximization Principle Wang, Linyu; Liu, Yuanning; Zhong, Xiaodan; Liu, Haiming; Lu, Chao; Li, Cong; Zhang, Hao Front Genet Genetics Frontiers Media S.A. 2019-03-04 /pmc/articles/PMC6409321/ /pubmed/30886627 http://dx.doi.org/10.3389/fgene.2019.00143 Text en Copyright © 2019 Wang, Liu, Zhong, Liu, Lu, Li and Zhang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title DMfold: A Novel Method to Predict RNA Secondary Structure With Pseudoknots Based on Deep Learning and Improved Base Pair Maximization Principle
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6409321/
https://www.ncbi.nlm.nih.gov/pubmed/30886627
http://dx.doi.org/10.3389/fgene.2019.00143