Cargando…

An improved Four-Russians method and sparsified Four-Russians algorithm for RNA folding

BACKGROUND: The basic RNA secondary structure prediction problem or single sequence folding problem (SSF) was solved 35 years ago by a now well-known [Formula: see text] -time dynamic programming method. Recently three methodologies—Valiant, Four-Russians, and Sparsification—have been applied to spe...

Descripción completa

Detalles Bibliográficos
Autores principales: Frid, Yelena, Gusfield, Dan
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2016
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4974819/
https://www.ncbi.nlm.nih.gov/pubmed/27499801
http://dx.doi.org/10.1186/s13015-016-0081-9
_version_ 1782446617678315520
author Frid, Yelena
Gusfield, Dan
author_facet Frid, Yelena
Gusfield, Dan
author_sort Frid, Yelena
collection PubMed
description BACKGROUND: The basic RNA secondary structure prediction problem or single sequence folding problem (SSF) was solved 35 years ago by a now well-known [Formula: see text] -time dynamic programming method. Recently three methodologies—Valiant, Four-Russians, and Sparsification—have been applied to speedup RNA secondary structure prediction. The sparsification method exploits two properties of the input: the number of subsequence Z with the endpoints belonging to the optimal folding set and the maximum number base-pairs L. These sparsity properties satisfy [Formula: see text] and [Formula: see text] , and the method reduces the algorithmic running time to O(LZ). While the Four-Russians method utilizes tabling partial results. RESULTS: In this paper, we explore three different algorithmic speedups. We first expand the reformulate the single sequence folding Four-Russians [Formula: see text] -time algorithm, to utilize an on-demand lookup table. Second, we create a framework that combines the fastest Sparsification and new fastest on-demand Four-Russians methods. This combined method has worst-case running time of [Formula: see text] , where [Formula: see text] and [Formula: see text] . Third we update the Four-Russians formulation to achieve an on-demand [Formula: see text] -time parallel algorithm. This then leads to an asymptotic speedup of [Formula: see text] where [Formula: see text] and [Formula: see text] the number of subsequence with the endpoint j belonging to the optimal folding set. CONCLUSIONS: The on-demand formulation not only removes all extraneous computation and allows us to incorporate more realistic scoring schemes, but leads us to take advantage of the sparsity properties. Through asymptotic analysis and empirical testing on the base-pair maximization variant and a more biologically informative scoring scheme, we show that this Sparse Four-Russians framework is able to achieve a speedup on every problem instance, that is asymptotically never worse, and empirically better than achieved by the minimum of the two methods alone.
format Online
Article
Text
id pubmed-4974819
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-49748192016-08-06 An improved Four-Russians method and sparsified Four-Russians algorithm for RNA folding Frid, Yelena Gusfield, Dan Algorithms Mol Biol Research BACKGROUND: The basic RNA secondary structure prediction problem or single sequence folding problem (SSF) was solved 35 years ago by a now well-known [Formula: see text] -time dynamic programming method. Recently three methodologies—Valiant, Four-Russians, and Sparsification—have been applied to speedup RNA secondary structure prediction. The sparsification method exploits two properties of the input: the number of subsequence Z with the endpoints belonging to the optimal folding set and the maximum number base-pairs L. These sparsity properties satisfy [Formula: see text] and [Formula: see text] , and the method reduces the algorithmic running time to O(LZ). While the Four-Russians method utilizes tabling partial results. RESULTS: In this paper, we explore three different algorithmic speedups. We first expand the reformulate the single sequence folding Four-Russians [Formula: see text] -time algorithm, to utilize an on-demand lookup table. Second, we create a framework that combines the fastest Sparsification and new fastest on-demand Four-Russians methods. This combined method has worst-case running time of [Formula: see text] , where [Formula: see text] and [Formula: see text] . Third we update the Four-Russians formulation to achieve an on-demand [Formula: see text] -time parallel algorithm. This then leads to an asymptotic speedup of [Formula: see text] where [Formula: see text] and [Formula: see text] the number of subsequence with the endpoint j belonging to the optimal folding set. CONCLUSIONS: The on-demand formulation not only removes all extraneous computation and allows us to incorporate more realistic scoring schemes, but leads us to take advantage of the sparsity properties. Through asymptotic analysis and empirical testing on the base-pair maximization variant and a more biologically informative scoring scheme, we show that this Sparse Four-Russians framework is able to achieve a speedup on every problem instance, that is asymptotically never worse, and empirically better than achieved by the minimum of the two methods alone. BioMed Central 2016-08-05 /pmc/articles/PMC4974819/ /pubmed/27499801 http://dx.doi.org/10.1186/s13015-016-0081-9 Text en © The Author(s) 2016 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Frid, Yelena
Gusfield, Dan
An improved Four-Russians method and sparsified Four-Russians algorithm for RNA folding
title An improved Four-Russians method and sparsified Four-Russians algorithm for RNA folding
title_full An improved Four-Russians method and sparsified Four-Russians algorithm for RNA folding
title_fullStr An improved Four-Russians method and sparsified Four-Russians algorithm for RNA folding
title_full_unstemmed An improved Four-Russians method and sparsified Four-Russians algorithm for RNA folding
title_short An improved Four-Russians method and sparsified Four-Russians algorithm for RNA folding
title_sort improved four-russians method and sparsified four-russians algorithm for rna folding
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4974819/
https://www.ncbi.nlm.nih.gov/pubmed/27499801
http://dx.doi.org/10.1186/s13015-016-0081-9
work_keys_str_mv AT fridyelena animprovedfourrussiansmethodandsparsifiedfourrussiansalgorithmforrnafolding
AT gusfielddan animprovedfourrussiansmethodandsparsifiedfourrussiansalgorithmforrnafolding
AT fridyelena improvedfourrussiansmethodandsparsifiedfourrussiansalgorithmforrnafolding
AT gusfielddan improvedfourrussiansmethodandsparsifiedfourrussiansalgorithmforrnafolding