A max-margin model for efficient simultaneous alignment and folding of RNA sequences

Motivation: The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene...

Bibliographic Details
Main authors: Do, Chuong B., Foo, Chuan-Sheng, Batzoglou, Serafim
Format: Text
Language: English
Published: Oxford University Press 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718655/
https://www.ncbi.nlm.nih.gov/pubmed/18586747
http://dx.doi.org/10.1093/bioinformatics/btn177
author Do, Chuong B.
Foo, Chuan-Sheng
Batzoglou, Serafim
collection PubMed
description Motivation: The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene prediction. In this work, we present RAF (RNA Alignment and Folding), an efficient algorithm for simultaneous alignment and consensus folding of unaligned RNA sequences. Algorithmically, RAF exploits sparsity in the set of likely pairing and alignment candidates for each nucleotide (as identified by the CONTRAfold or CONTRAlign programs) to achieve an effectively quadratic running time for simultaneous pairwise alignment and folding. RAF's fast sparse dynamic programming, in turn, serves as the inference engine within a discriminative machine learning algorithm for parameter estimation. Results: In cross-validated benchmark tests, RAF achieves accuracies equaling or surpassing the current best approaches for RNA multiple sequence secondary structure prediction. However, RAF requires nearly an order of magnitude less time than other simultaneous folding and alignment methods, thus making it especially appropriate for high-throughput studies. Availability: Source code for RAF is available at: http://contra.stanford.edu/contrafold/ Contact: chuongdo@cs.stanford.edu
format Text
id pubmed-2718655
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2718655 2009-07-31 A max-margin model for efficient simultaneous alignment and folding of RNA sequences Do, Chuong B. Foo, Chuan-Sheng Batzoglou, Serafim Bioinformatics ISMB 2008 Conference Proceedings 19–23 July 2008, Toronto Oxford University Press 2008-07-01 /pmc/articles/PMC2718655/ /pubmed/18586747 http://dx.doi.org/10.1093/bioinformatics/btn177 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A max-margin model for efficient simultaneous alignment and folding of RNA sequences
topic ISMB 2008 Conference Proceedings, 19–23 July 2008, Toronto
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718655/
https://www.ncbi.nlm.nih.gov/pubmed/18586747
http://dx.doi.org/10.1093/bioinformatics/btn177