A max-margin model for efficient simultaneous alignment and folding of RNA sequences
Motivation: The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene...
Main authors: | Do, Chuong B.; Foo, Chuan-Sheng; Batzoglou, Serafim |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718655/ https://www.ncbi.nlm.nih.gov/pubmed/18586747 http://dx.doi.org/10.1093/bioinformatics/btn177 |
_version_ | 1782170009326321664 |
---|---|
author | Do, Chuong B. Foo, Chuan-Sheng Batzoglou, Serafim |
author_sort | Do, Chuong B. |
collection | PubMed |
description | Motivation: The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene prediction. In this work, we present RAF (RNA Alignment and Folding), an efficient algorithm for simultaneous alignment and consensus folding of unaligned RNA sequences. Algorithmically, RAF exploits sparsity in the set of likely pairing and alignment candidates for each nucleotide (as identified by the CONTRAfold or CONTRAlign programs) to achieve an effectively quadratic running time for simultaneous pairwise alignment and folding. RAF's fast sparse dynamic programming, in turn, serves as the inference engine within a discriminative machine learning algorithm for parameter estimation. Results: In cross-validated benchmark tests, RAF achieves accuracies equaling or surpassing the current best approaches for RNA multiple sequence secondary structure prediction. However, RAF requires nearly an order of magnitude less time than other simultaneous folding and alignment methods, thus making it especially appropriate for high-throughput studies. Availability: Source code for RAF is available at: http://contra.stanford.edu/contrafold/ Contact: chuongdo@cs.stanford.edu |
format | Text |
id | pubmed-2718655 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2718655 2009-07-31 A max-margin model for efficient simultaneous alignment and folding of RNA sequences Do, Chuong B. Foo, Chuan-Sheng Batzoglou, Serafim Bioinformatics ISMB 2008 Conference Proceedings 19–23 July 2008, Toronto Motivation: The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene prediction. In this work, we present RAF (RNA Alignment and Folding), an efficient algorithm for simultaneous alignment and consensus folding of unaligned RNA sequences. Algorithmically, RAF exploits sparsity in the set of likely pairing and alignment candidates for each nucleotide (as identified by the CONTRAfold or CONTRAlign programs) to achieve an effectively quadratic running time for simultaneous pairwise alignment and folding. RAF's fast sparse dynamic programming, in turn, serves as the inference engine within a discriminative machine learning algorithm for parameter estimation. Results: In cross-validated benchmark tests, RAF achieves accuracies equaling or surpassing the current best approaches for RNA multiple sequence secondary structure prediction. However, RAF requires nearly an order of magnitude less time than other simultaneous folding and alignment methods, thus making it especially appropriate for high-throughput studies. Availability: Source code for RAF is available at: http://contra.stanford.edu/contrafold/ Contact: chuongdo@cs.stanford.edu Oxford University Press 2008-07-01 /pmc/articles/PMC2718655/ /pubmed/18586747 http://dx.doi.org/10.1093/bioinformatics/btn177 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | ISMB 2008 Conference Proceedings 19–23 July 2008, Toronto Do, Chuong B. Foo, Chuan-Sheng Batzoglou, Serafim A max-margin model for efficient simultaneous alignment and folding of RNA sequences |
title | A max-margin model for efficient simultaneous alignment and folding of RNA sequences |
title_sort | max-margin model for efficient simultaneous alignment and folding of rna sequences |
topic | ISMB 2008 Conference Proceedings 19–23 July 2008, Toronto |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718655/ https://www.ncbi.nlm.nih.gov/pubmed/18586747 http://dx.doi.org/10.1093/bioinformatics/btn177 |