
Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints

BACKGROUND: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this pr...


Bibliographic Details
Main Authors: Dowell, Robin D, Eddy, Sean R
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1579236/
https://www.ncbi.nlm.nih.gov/pubmed/16952317
http://dx.doi.org/10.1186/1471-2105-7-400
author Dowell, Robin D
Eddy, Sean R
collection PubMed
description BACKGROUND: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding and development of practical heuristics for dealing with the computational complexity of the algorithm. RESULTS: We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. CONCLUSION: Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm – this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN – have comparable overall performance with different strengths and weaknesses.
format Text
id pubmed-1579236
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1579236 2006-10-02 Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints Dowell, Robin D Eddy, Sean R BMC Bioinformatics Methodology Article BioMed Central 2006-09-04 /pmc/articles/PMC1579236/ /pubmed/16952317 http://dx.doi.org/10.1186/1471-2105-7-400 Text en Copyright © 2006 Dowell and Eddy; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints
topic Methodology Article