Cargando…

Faster algorithms for RNA-folding using the Four-Russians method

BACKGROUND: The secondary structure that maximizes the number of non-crossing matchings between complimentary bases of an RNA sequence of length n can be computed in O(n(3)) time using Nussinov’s dynamic programming algorithm. The Four-Russians method is a technique that reduces the running time for...

Descripción completa

Detalles Bibliográficos
Autores principales: Venkatachalam, Balaji, Gusfield, Dan, Frid, Yelena
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2014
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3996002/
https://www.ncbi.nlm.nih.gov/pubmed/24602450
http://dx.doi.org/10.1186/1748-7188-9-5
_version_ 1782312969861857280
author Venkatachalam, Balaji
Gusfield, Dan
Frid, Yelena
author_facet Venkatachalam, Balaji
Gusfield, Dan
Frid, Yelena
author_sort Venkatachalam, Balaji
collection PubMed
description BACKGROUND: The secondary structure that maximizes the number of non-crossing matchings between complimentary bases of an RNA sequence of length n can be computed in O(n(3)) time using Nussinov’s dynamic programming algorithm. The Four-Russians method is a technique that reduces the running time for certain dynamic programming algorithms by a multiplicative factor after a preprocessing step where solutions to all smaller subproblems of a fixed size are exhaustively enumerated and solved. Frid and Gusfield designed an [Formula: see text] algorithm for RNA folding using the Four-Russians technique. In their algorithm the preprocessing is interleaved with the algorithm computation. THEORETICAL RESULTS: We simplify the algorithm and the analysis by doing the preprocessing once prior to the algorithm computation. We call this the two-vector method. We also show variants where instead of exhaustive preprocessing, we only solve the subproblems encountered in the main algorithm once and memoize the results. We give a simple proof of correctness and explore the practical advantages over the earlier method. The Nussinov algorithm admits an O(n(2)) time parallel algorithm. We show a parallel algorithm using the two-vector idea that improves the time bound to [Formula: see text]. PRACTICAL RESULTS: We have implemented the parallel algorithm on graphics processing units using the CUDA platform. We discuss the organization of the data structures to exploit coalesced memory access for fast running times. The ideas to organize the data structures also help in improving the running time of the serial algorithms. For sequences of length up to 6000 bases the parallel algorithm takes only about 2.5 seconds and the two-vector serial method takes about 57 seconds on a desktop and 15 seconds on a server. Among the serial algorithms, the two-vector and memoized versions are faster than the Frid-Gusfield algorithm by a factor of 3, and are faster than Nussinov by up to a factor of 20. The source-code for the algorithms is available at http://github.com/ijalabv/FourRussiansRNAFolding.
format Online
Article
Text
id pubmed-3996002
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-39960022014-05-07 Faster algorithms for RNA-folding using the Four-Russians method Venkatachalam, Balaji Gusfield, Dan Frid, Yelena Algorithms Mol Biol Research BACKGROUND: The secondary structure that maximizes the number of non-crossing matchings between complimentary bases of an RNA sequence of length n can be computed in O(n(3)) time using Nussinov’s dynamic programming algorithm. The Four-Russians method is a technique that reduces the running time for certain dynamic programming algorithms by a multiplicative factor after a preprocessing step where solutions to all smaller subproblems of a fixed size are exhaustively enumerated and solved. Frid and Gusfield designed an [Formula: see text] algorithm for RNA folding using the Four-Russians technique. In their algorithm the preprocessing is interleaved with the algorithm computation. THEORETICAL RESULTS: We simplify the algorithm and the analysis by doing the preprocessing once prior to the algorithm computation. We call this the two-vector method. We also show variants where instead of exhaustive preprocessing, we only solve the subproblems encountered in the main algorithm once and memoize the results. We give a simple proof of correctness and explore the practical advantages over the earlier method. The Nussinov algorithm admits an O(n(2)) time parallel algorithm. We show a parallel algorithm using the two-vector idea that improves the time bound to [Formula: see text]. PRACTICAL RESULTS: We have implemented the parallel algorithm on graphics processing units using the CUDA platform. We discuss the organization of the data structures to exploit coalesced memory access for fast running times. The ideas to organize the data structures also help in improving the running time of the serial algorithms. For sequences of length up to 6000 bases the parallel algorithm takes only about 2.5 seconds and the two-vector serial method takes about 57 seconds on a desktop and 15 seconds on a server. Among the serial algorithms, the two-vector and memoized versions are faster than the Frid-Gusfield algorithm by a factor of 3, and are faster than Nussinov by up to a factor of 20. The source-code for the algorithms is available at http://github.com/ijalabv/FourRussiansRNAFolding. BioMed Central 2014-03-06 /pmc/articles/PMC3996002/ /pubmed/24602450 http://dx.doi.org/10.1186/1748-7188-9-5 Text en Copyright © 2014 Venkatachalam et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Venkatachalam, Balaji
Gusfield, Dan
Frid, Yelena
Faster algorithms for RNA-folding using the Four-Russians method
title Faster algorithms for RNA-folding using the Four-Russians method
title_full Faster algorithms for RNA-folding using the Four-Russians method
title_fullStr Faster algorithms for RNA-folding using the Four-Russians method
title_full_unstemmed Faster algorithms for RNA-folding using the Four-Russians method
title_short Faster algorithms for RNA-folding using the Four-Russians method
title_sort faster algorithms for rna-folding using the four-russians method
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3996002/
https://www.ncbi.nlm.nih.gov/pubmed/24602450
http://dx.doi.org/10.1186/1748-7188-9-5
work_keys_str_mv AT venkatachalambalaji fasteralgorithmsforrnafoldingusingthefourrussiansmethod
AT gusfielddan fasteralgorithmsforrnafoldingusingthefourrussiansmethod
AT fridyelena fasteralgorithmsforrnafoldingusingthefourrussiansmethod