Fast and accurate structure probability estimation for simultaneous alignment and folding of RNAs with Markov chains

Bibliographic Details
Main authors: Miladi, Milad; Raden, Martin; Will, Sebastian; Backofen, Rolf
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7666477/
https://www.ncbi.nlm.nih.gov/pubmed/33292340
http://dx.doi.org/10.1186/s13015-020-00179-w
author Miladi, Milad
Raden, Martin
Will, Sebastian
Backofen, Rolf
collection PubMed
description MOTIVATION: Simultaneous alignment and folding (SA&F) of RNAs is the indispensable gold standard for inferring the structure of non-coding RNAs and for their general analysis. The original algorithm, proposed by Sankoff, solves the theoretical problem exactly with a complexity of O(n^6) in the full energy model. Over the last two decades, several variants and improvements of the Sankoff algorithm have been proposed to reduce its extreme complexity by simplifying the energy model or imposing restrictions on the predicted alignments. RESULTS: Here, we introduce a novel variant of Sankoff’s algorithm that reconciles the simplifications of PMcomp, namely moving from the full energy model to a simpler base pair-based model, with the accuracy of the loop-based full energy model. Instead of estimating pseudo-energies from unconditional base pair probabilities, our model calculates energies from conditional base pair probabilities, which makes it possible to accurately capture structure probabilities, which obey a conditional dependency. This model gives rise to the fast and highly accurate novel algorithm Pankov (Probabilistic Sankoff-like simultaneous alignment and folding of RNAs inspired by Markov chains). CONCLUSIONS: Pankov benefits from the speed-up of excluding unreliable base-pairing without compromising the loop-based free energy model of Sankoff’s algorithm. We show that Pankov outperforms its predecessors LocARNA and SPARSE in folding quality and is faster than LocARNA.
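As a rough illustration of the contrast drawn in the description, the sketch below estimates the probability of a toy three-pair stem in two ways: as a product of unconditional base pair probabilities, and as a Markov-chain-style product in which each pair is conditioned on the pair that directly encloses it. The function names, the stem, and all probability values are hypothetical, and the choice of conditioning on the enclosing pair is an assumption made for illustration; this is not Pankov's actual recursion. Pseudo-energies are derived with the standard relation E = -RT ln(p).

```python
import math

RT = 0.616  # kcal/mol near 37 degrees C; illustrative constant


def pseudo_energy(p):
    """Convert a probability into a pseudo-energy via E = -RT * ln(p)."""
    return -RT * math.log(p)


def independent_estimate(pairs, p_uncond):
    """Product of unconditional base pair probabilities (independence assumed)."""
    return math.prod(p_uncond[bp] for bp in pairs)


def chain_factors(pairs, parent, p_uncond, p_cond):
    """One factor per pair: unconditional for the outermost pair,
    conditional on the enclosing (parent) pair for all others."""
    factors = []
    for bp in pairs:
        par = parent.get(bp)
        factors.append(p_uncond[bp] if par is None else p_cond[(bp, par)])
    return factors


def chain_estimate(pairs, parent, p_uncond, p_cond):
    """Markov-chain-style estimate: product of the chained factors."""
    return math.prod(chain_factors(pairs, parent, p_uncond, p_cond))


# Hypothetical toy stem of three nested base pairs (positions are made up).
stem = [(1, 20), (2, 19), (3, 18)]
parent = {(2, 19): (1, 20), (3, 18): (2, 19)}  # enclosing pair of each inner pair

# Hypothetical probabilities, not taken from any real partition function.
p_uncond = {(1, 20): 0.5, (2, 19): 0.5, (3, 18): 0.5}
p_cond = {((2, 19), (1, 20)): 0.75, ((3, 18), (2, 19)): 0.75}

p_indep = independent_estimate(stem, p_uncond)
p_chain = chain_estimate(stem, parent, p_uncond, p_cond)

print("independent estimate:", p_indep, "pseudo-energy:", pseudo_energy(p_indep))
print("chained estimate:    ", p_chain, "pseudo-energy:", pseudo_energy(p_chain))
```

Because the logarithm turns products into sums, the pseudo-energy of the chained estimate equals the sum of the per-pair pseudo-energies, which is what lets a base pair-based recursion add up conditional contributions one pair at a time.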
format Online
Article
Text
id pubmed-7666477
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7666477 2020-11-16 Fast and accurate structure probability estimation for simultaneous alignment and folding of RNAs with Markov chains. Miladi, Milad; Raden, Martin; Will, Sebastian; Backofen, Rolf. Algorithms Mol Biol. Research.
BioMed Central 2020-11-13 /pmc/articles/PMC7666477/ /pubmed/33292340 http://dx.doi.org/10.1186/s13015-020-00179-w Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Fast and accurate structure probability estimation for simultaneous alignment and folding of RNAs with Markov chains
topic Research