ENTRNA: a framework to predict RNA foldability
BACKGROUND: RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions...
Main Authors: | Su, Congzhe; Weir, Jeffery D.; Zhang, Fei; Yan, Hao; Wu, Teresa |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6610807/ https://www.ncbi.nlm.nih.gov/pubmed/31269893 http://dx.doi.org/10.1186/s12859-019-2948-5 |
_version_ | 1783432569294946304 |
---|---|
author | Su, Congzhe Weir, Jeffery D. Zhang, Fei Yan, Hao Wu, Teresa |
author_facet | Su, Congzhe Weir, Jeffery D. Zhang, Fei Yan, Hao Wu, Teresa |
author_sort | Su, Congzhe |
collection | PubMed |
description | BACKGROUND: RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions. Existing literature has focused on RNA design as either an RNA structure prediction problem or an RNA inverse folding problem where free energy has played a key role. RESULTS: In this research, we propose a Positive-Unlabeled data-driven framework termed ENTRNA. Other than free energy and commonly studied sequence and structural features, we propose a new feature, Sequence Segment Entropy (SSE), to measure the diversity of RNA sequences. ENTRNA is trained and cross-validated using 1024 pseudoknot-free RNAs and 1060 pseudoknotted RNAs from the RNASTRAND database, respectively. To test the robustness of ENTRNA, the models are further blind tested on 206 pseudoknot-free and 93 pseudoknotted RNAs from the PDB database. For pseudoknot-free RNAs, ENTRNA has 86.5% sensitivity on the training dataset and 80.6% sensitivity on the testing dataset. For pseudoknotted RNAs, ENTRNA shows 81.5% sensitivity on the training dataset and 71.0% on the testing dataset. To test the applicability of ENTRNA to long, structurally complex RNAs, we collect 5 laboratory synthetic RNAs ranging from 1618 to 1790 nucleotides. ENTRNA is able to predict the foldability of 4 RNAs. CONCLUSION: In this article, we reformulate the RNA design problem as a foldability prediction problem, which is to predict the likelihood of the co-existence of a sequence-structure pair. This new construct has the potential for both RNA structure prediction and the inverse folding problem. In addition, this new construct enables us to explore data-driven approaches in RNA research. |
format | Online Article Text |
id | pubmed-6610807 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6610807 2019-07-16 ENTRNA: a framework to predict RNA foldability Su, Congzhe Weir, Jeffery D. Zhang, Fei Yan, Hao Wu, Teresa BMC Bioinformatics Research Article BACKGROUND: RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions. Existing literature has focused on RNA design as either an RNA structure prediction problem or an RNA inverse folding problem where free energy has played a key role. RESULTS: In this research, we propose a Positive-Unlabeled data-driven framework termed ENTRNA. Other than free energy and commonly studied sequence and structural features, we propose a new feature, Sequence Segment Entropy (SSE), to measure the diversity of RNA sequences. ENTRNA is trained and cross-validated using 1024 pseudoknot-free RNAs and 1060 pseudoknotted RNAs from the RNASTRAND database, respectively. To test the robustness of ENTRNA, the models are further blind tested on 206 pseudoknot-free and 93 pseudoknotted RNAs from the PDB database. For pseudoknot-free RNAs, ENTRNA has 86.5% sensitivity on the training dataset and 80.6% sensitivity on the testing dataset. For pseudoknotted RNAs, ENTRNA shows 81.5% sensitivity on the training dataset and 71.0% on the testing dataset. To test the applicability of ENTRNA to long, structurally complex RNAs, we collect 5 laboratory synthetic RNAs ranging from 1618 to 1790 nucleotides. ENTRNA is able to predict the foldability of 4 RNAs. CONCLUSION: In this article, we reformulate the RNA design problem as a foldability prediction problem, which is to predict the likelihood of the co-existence of a sequence-structure pair. This new construct has the potential for both RNA structure prediction and the inverse folding problem. In addition, this new construct enables us to explore data-driven approaches in RNA research. BioMed Central 2019-07-03 /pmc/articles/PMC6610807/ /pubmed/31269893 http://dx.doi.org/10.1186/s12859-019-2948-5 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Su, Congzhe Weir, Jeffery D. Zhang, Fei Yan, Hao Wu, Teresa ENTRNA: a framework to predict RNA foldability |
title | ENTRNA: a framework to predict RNA foldability |
title_full | ENTRNA: a framework to predict RNA foldability |
title_fullStr | ENTRNA: a framework to predict RNA foldability |
title_full_unstemmed | ENTRNA: a framework to predict RNA foldability |
title_short | ENTRNA: a framework to predict RNA foldability |
title_sort | entrna: a framework to predict rna foldability |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6610807/ https://www.ncbi.nlm.nih.gov/pubmed/31269893 http://dx.doi.org/10.1186/s12859-019-2948-5 |
work_keys_str_mv | AT sucongzhe entrnaaframeworktopredictrnafoldability AT weirjefferyd entrnaaframeworktopredictrnafoldability AT zhangfei entrnaaframeworktopredictrnafoldability AT yanhao entrnaaframeworktopredictrnafoldability AT wuteresa entrnaaframeworktopredictrnafoldability |
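
The sensitivity figures quoted in the description field can be related to whole-number counts with a little arithmetic. The sketch below is illustrative only: sensitivity is taken as true positives divided by all actual positives, and the counts it prints are inferred by rounding the reported percentages against the blind-test set sizes (206 and 93). The record does not state those counts, and the function names `sensitivity` and `implied_true_positives` are hypothetical helpers, not part of ENTRNA.

```python
def sensitivity(true_positives: int, total_positives: int) -> float:
    """Sensitivity (recall on positives) = true positives / all actual positives."""
    if total_positives <= 0:
        raise ValueError("total_positives must be positive")
    return true_positives / total_positives


def implied_true_positives(reported_pct: float, total_positives: int) -> int:
    """Nearest whole count consistent with a reported percentage.

    This is an inference from the rounded percentage, not data from the record.
    """
    return round(reported_pct / 100 * total_positives)


# Blind-test set sizes and sensitivities as quoted in the description field.
blind_test = {
    "pseudoknot-free": (206, 80.6),
    "pseudoknotted": (93, 71.0),
}

for name, (n, pct) in blind_test.items():
    tp = implied_true_positives(pct, n)
    print(f"{name}: ~{tp}/{n} = {100 * sensitivity(tp, n):.1f}%")
# prints:
# pseudoknot-free: ~166/206 = 80.6%
# pseudoknotted: ~66/93 = 71.0%
```

Under these assumptions, the reported blind-test percentages are consistent with roughly 166 of 206 pseudoknot-free and 66 of 93 pseudoknotted RNAs being predicted as foldable.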