ENTRNA: a framework to predict RNA foldability

BACKGROUND: RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions...

Bibliographic Details
Main authors: Su, Congzhe; Weir, Jeffery D.; Zhang, Fei; Yan, Hao; Wu, Teresa
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6610807/
https://www.ncbi.nlm.nih.gov/pubmed/31269893
http://dx.doi.org/10.1186/s12859-019-2948-5
_version_ 1783432569294946304
author Su, Congzhe
Weir, Jeffery D.
Zhang, Fei
Yan, Hao
Wu, Teresa
author_facet Su, Congzhe
Weir, Jeffery D.
Zhang, Fei
Yan, Hao
Wu, Teresa
author_sort Su, Congzhe
collection PubMed
description BACKGROUND: RNA molecules play many crucial roles in living systems. The spatial complexity of RNA structures determines their cellular functions. Understanding RNA folding conformations, and RNA secondary structures in particular, is therefore critical for elucidating biological functions. Existing literature has treated RNA design as either an RNA structure prediction problem or an RNA inverse folding problem, in both of which free energy plays a key role. RESULTS: In this research, we propose a Positive-Unlabeled data-driven framework termed ENTRNA. In addition to free energy and commonly studied sequence and structural features, we propose a new feature, Sequence Segment Entropy (SSE), to measure the diversity of RNA sequences. ENTRNA is trained and cross-validated on 1024 pseudoknot-free RNAs and 1060 pseudoknotted RNAs, respectively, from the RNASTRAND database. To test the robustness of ENTRNA, the models are further blind tested on 206 pseudoknot-free and 93 pseudoknotted RNAs from the PDB database. For pseudoknot-free RNAs, ENTRNA has 86.5% sensitivity on the training dataset and 80.6% sensitivity on the testing dataset. For pseudoknotted RNAs, ENTRNA shows 81.5% sensitivity on the training dataset and 71.0% on the testing dataset. To test the applicability of ENTRNA to long, structurally complex RNAs, we collect 5 laboratory-synthesized RNAs ranging from 1618 to 1790 nucleotides. ENTRNA is able to predict the foldability of 4 of these RNAs. CONCLUSION: In this article, we reformulate the RNA design problem as a foldability prediction problem, that is, predicting the likelihood that a given sequence-structure pair co-exists. This new construct has potential for both RNA structure prediction and the inverse folding problem. In addition, it enables us to explore data-driven approaches in RNA research.
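The abstract reports results as sensitivity, which fits a Positive-Unlabeled setting: only known-positive sequence-structure pairs carry labels, so the share of them that the model recovers is the quantity that can be measured directly. The following is a minimal sketch of that metric, not the authors' code. The function name `sensitivity` and the example predictions are hypothetical and are not taken from the paper.

```python
def sensitivity(predicted_foldable):
    """Return TP / (TP + FN) over known-positive examples.

    `predicted_foldable` holds one boolean per known-positive
    sequence-structure pair: True if the model called it foldable.
    """
    if not predicted_foldable:
        raise ValueError("need at least one known-positive example")
    true_positives = sum(1 for p in predicted_foldable if p)
    return true_positives / len(predicted_foldable)


# Hypothetical predictions for five known-positive pairs (illustrative only).
preds = [True, True, False, True, True]
print(f"{sensitivity(preds):.1%}")  # prints 80.0%
```

Because no confirmed negatives exist in this setting, a metric such as specificity cannot be computed the same way, which is presumably why only sensitivity appears in the reported results.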
format Online
Article
Text
id pubmed-6610807
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6610807 2019-07-16 ENTRNA: a framework to predict RNA foldability Su, Congzhe Weir, Jeffery D. Zhang, Fei Yan, Hao Wu, Teresa BMC Bioinformatics Research Article BioMed Central 2019-07-03 /pmc/articles/PMC6610807/ /pubmed/31269893 http://dx.doi.org/10.1186/s12859-019-2948-5 Text en © The Author(s). 2019 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Su, Congzhe
Weir, Jeffery D.
Zhang, Fei
Yan, Hao
Wu, Teresa
ENTRNA: a framework to predict RNA foldability
title ENTRNA: a framework to predict RNA foldability
title_full ENTRNA: a framework to predict RNA foldability
title_fullStr ENTRNA: a framework to predict RNA foldability
title_full_unstemmed ENTRNA: a framework to predict RNA foldability
title_short ENTRNA: a framework to predict RNA foldability
title_sort entrna: a framework to predict rna foldability
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6610807/
https://www.ncbi.nlm.nih.gov/pubmed/31269893
http://dx.doi.org/10.1186/s12859-019-2948-5
work_keys_str_mv AT sucongzhe entrnaaframeworktopredictrnafoldability
AT weirjefferyd entrnaaframeworktopredictrnafoldability
AT zhangfei entrnaaframeworktopredictrnafoldability
AT yanhao entrnaaframeworktopredictrnafoldability
AT wuteresa entrnaaframeworktopredictrnafoldability