ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism
Accurate RNA secondary structure information is the cornerstone of gene function research and RNA tertiary structure prediction. However, most traditional RNA secondary structure prediction algorithms are based on the dynamic programming (DP) algorithm, according to the minimum free energy theory, w...
Main authors: | Wang, Yili; Liu, Yuanning; Wang, Shuo; Liu, Zhen; Gao, Yubing; Zhang, Hao; Dong, Liyan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7770172/ https://www.ncbi.nlm.nih.gov/pubmed/33384721 http://dx.doi.org/10.3389/fgene.2020.612086 |
_version_ | 1783629451081285632 |
---|---|
author | Wang, Yili; Liu, Yuanning; Wang, Shuo; Liu, Zhen; Gao, Yubing; Zhang, Hao; Dong, Liyan |
author_facet | Wang, Yili; Liu, Yuanning; Wang, Shuo; Liu, Zhen; Gao, Yubing; Zhang, Hao; Dong, Liyan |
author_sort | Wang, Yili |
collection | PubMed |
description | Accurate RNA secondary structure information is the cornerstone of gene function research and RNA tertiary structure prediction. However, most traditional RNA secondary structure prediction algorithms are based on the dynamic programming (DP) algorithm, according to the minimum free energy theory, with both hard and soft constraints. The accuracy is particularly dependent on the accuracy of soft constraints (from experimental data like chemical and enzyme detection). With the elongation of the RNA sequence, the time complexity of DP-based algorithms will increase geometrically, as a result, they are not good at coping with relatively long sequences. Furthermore, due to the complexity of the pseudoknots structure, the secondary structure prediction method, based on traditional algorithms, has great defects which cannot predict the secondary structure with pseudoknots well. Therefore, few algorithms have been available for pseudoknots prediction in the past. The ATTfold algorithm proposed in this article is a deep learning algorithm based on an attention mechanism. It analyzes the global information of the RNA sequence via the characteristics of the attention mechanism, focuses on the correlation between paired bases, and solves the problem of long sequence prediction. Moreover, this algorithm also extracts the effective multi-dimensional features from a great number of RNA sequences and structure information, by combining the exclusive hard constraints of RNA secondary structure. Hence, it accurately determines the pairing position of each base, and obtains the real and effective RNA secondary structure, including pseudoknots. Finally, after training the ATTfold algorithm model through tens of thousands of RNA sequences and their real secondary structures, this algorithm was compared with four classic RNA secondary structure prediction algorithms. The results show that our algorithm significantly outperforms others and more accurately showed the secondary structure of RNA. As the data in RNA sequence databases increase, our deep learning-based algorithm will have superior performance. In the future, this kind of algorithm will be more indispensable. |
format | Online Article Text |
id | pubmed-7770172 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7770172 2020-12-30 ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism Wang, Yili Liu, Yuanning Wang, Shuo Liu, Zhen Gao, Yubing Zhang, Hao Dong, Liyan Front Genet Genetics Accurate RNA secondary structure information is the cornerstone of gene function research and RNA tertiary structure prediction. However, most traditional RNA secondary structure prediction algorithms are based on the dynamic programming (DP) algorithm, according to the minimum free energy theory, with both hard and soft constraints. The accuracy is particularly dependent on the accuracy of soft constraints (from experimental data like chemical and enzyme detection). With the elongation of the RNA sequence, the time complexity of DP-based algorithms will increase geometrically, as a result, they are not good at coping with relatively long sequences. Furthermore, due to the complexity of the pseudoknots structure, the secondary structure prediction method, based on traditional algorithms, has great defects which cannot predict the secondary structure with pseudoknots well. Therefore, few algorithms have been available for pseudoknots prediction in the past. The ATTfold algorithm proposed in this article is a deep learning algorithm based on an attention mechanism. It analyzes the global information of the RNA sequence via the characteristics of the attention mechanism, focuses on the correlation between paired bases, and solves the problem of long sequence prediction. Moreover, this algorithm also extracts the effective multi-dimensional features from a great number of RNA sequences and structure information, by combining the exclusive hard constraints of RNA secondary structure. Hence, it accurately determines the pairing position of each base, and obtains the real and effective RNA secondary structure, including pseudoknots. Finally, after training the ATTfold algorithm model through tens of thousands of RNA sequences and their real secondary structures, this algorithm was compared with four classic RNA secondary structure prediction algorithms. The results show that our algorithm significantly outperforms others and more accurately showed the secondary structure of RNA. As the data in RNA sequence databases increase, our deep learning-based algorithm will have superior performance. In the future, this kind of algorithm will be more indispensable. Frontiers Media S.A. 2020-12-15 /pmc/articles/PMC7770172/ /pubmed/33384721 http://dx.doi.org/10.3389/fgene.2020.612086 Text en Copyright © 2020 Wang, Liu, Wang, Liu, Gao, Zhang and Dong. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Wang, Yili Liu, Yuanning Wang, Shuo Liu, Zhen Gao, Yubing Zhang, Hao Dong, Liyan ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism |
title | ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism |
title_full | ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism |
title_fullStr | ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism |
title_full_unstemmed | ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism |
title_short | ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism |
title_sort | attfold: rna secondary structure prediction with pseudoknots based on attention mechanism |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7770172/ https://www.ncbi.nlm.nih.gov/pubmed/33384721 http://dx.doi.org/10.3389/fgene.2020.612086 |
work_keys_str_mv | AT wangyili attfoldrnasecondarystructurepredictionwithpseudoknotsbasedonattentionmechanism AT liuyuanning attfoldrnasecondarystructurepredictionwithpseudoknotsbasedonattentionmechanism AT wangshuo attfoldrnasecondarystructurepredictionwithpseudoknotsbasedonattentionmechanism AT liuzhen attfoldrnasecondarystructurepredictionwithpseudoknotsbasedonattentionmechanism AT gaoyubing attfoldrnasecondarystructurepredictionwithpseudoknotsbasedonattentionmechanism AT zhanghao attfoldrnasecondarystructurepredictionwithpseudoknotsbasedonattentionmechanism AT dongliyan attfoldrnasecondarystructurepredictionwithpseudoknotsbasedonattentionmechanism |
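
The description above refers to the "hard constraints of RNA secondary structure" and to structures "including pseudoknots" without spelling either out. The following is a minimal illustrative sketch, not the ATTfold implementation. It assumes the commonly used hard constraints (only A-U, G-C, and G-U pairs; at least three unpaired bases enclosed by any pair; each base pairs at most once) and treats a pseudoknot as two crossing pairs (i, j) and (k, l) with i < k < j < l. All function and constant names are hypothetical.

```python
# Illustrative sketch only; names and the MIN_LOOP value are assumptions,
# not taken from the ATTfold article or its code.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def constraint_mask(seq):
    """Return an n x n 0/1 matrix; 1 where bases i and j are allowed to pair."""
    n = len(seq)
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        # Start far enough from i that at least MIN_LOOP bases lie in between.
        for j in range(i + MIN_LOOP + 1, n):
            if (seq[i], seq[j]) in CANONICAL_PAIRS:
                mask[i][j] = mask[j][i] = 1
    return mask


def is_valid_structure(seq, pairs):
    """Check (i, j) pairs against the mask and the one-partner-per-base rule."""
    mask = constraint_mask(seq)
    used = set()
    for i, j in pairs:
        if not mask[i][j] or i in used or j in used:
            return False
        used.update((i, j))
    return True


def has_pseudoknot(pairs):
    """True if any two pairs cross, i.e. i < k < j < l."""
    ordered = sorted((min(p), max(p)) for p in pairs)
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return True
    return False
```

For example, on the toy sequence `GGGAAACCC` the nested pairs `[(0, 8), (1, 7), (2, 6)]` satisfy all three assumed constraints and contain no crossing, whereas a pair list such as `[(0, 5), (3, 8)]` is flagged by `has_pseudoknot` because the two pairs interleave.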