
RNAdualPF: software to compute the dual partition function with sample applications in molecular evolution theory

BACKGROUND: RNA inverse folding is the problem of finding one or more sequences that fold into a user-specified target structure s₀, i.e. whose minimum free energy secondary structure is identical to the target s₀. Here we consider the ensemble of all RNA sequences that have low free energy wi...


Bibliographic Details
Main authors: Garcia-Martin, Juan Antonio, Bayegan, Amir H., Dotu, Ivan, Clote, Peter
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5069997/
https://www.ncbi.nlm.nih.gov/pubmed/27756204
http://dx.doi.org/10.1186/s12859-016-1280-6
author Garcia-Martin, Juan Antonio
Bayegan, Amir H.
Dotu, Ivan
Clote, Peter
collection PubMed
description BACKGROUND: RNA inverse folding is the problem of finding one or more sequences that fold into a user-specified target structure s₀, i.e. whose minimum free energy secondary structure is identical to the target s₀. Here we consider the ensemble of all RNA sequences that have low free energy with respect to a given target s₀.
RESULTS: We introduce the program RNAdualPF, which computes the dual partition function Z*, defined as the sum of Boltzmann factors exp(−E(a, s₀)/RT) of all RNA nucleotide sequences a compatible with target structure s₀. Using RNAdualPF, we efficiently sample RNA sequences that approximately fold into s₀, where additionally the user can specify IUPAC sequence constraints at certain positions, and whether to include dangles (energy terms for stacked, single-stranded nucleotides). Moreover, since we also compute the dual partition function Z*(k) over all sequences having GC-content k, the user can require that all sampled sequences have a precise, specified GC-content. Using Z*, we compute the dual expected energy ⟨E*⟩, and use it to show that natural RNAs from the Rfam 12.0 database have higher minimum free energy than expected, thus suggesting that functional RNAs are under evolutionary pressure to be only marginally thermodynamically stable. We show that C. elegans precursor microRNA (pre-miRNA) is significantly non-robust with respect to mutations, by comparing the robustness of each wild type pre-miRNA sequence with 2000 [resp. 500] sequences of the same GC-content generated by RNAdualPF, which approximately [resp. exactly] fold into the wild type target structure. We confirm and strengthen earlier findings that precursor microRNAs and bacterial small noncoding RNAs display plasticity, a measure of structural diversity.
CONCLUSION: We describe RNAdualPF, which rapidly computes the dual partition function Z* and samples sequences having low energy with respect to a target structure, allowing sequence constraints and specified GC-content. Using different inverse folding software, another group had earlier shown that pre-miRNA is mutationally robust, even controlling for compositional bias. Our opposite conclusion suggests a cautionary note that computationally based insights into molecular evolution may heavily depend on the software used. C/C++-software for RNAdualPF is available at http://bioinformatics.bc.edu/clotelab/RNAdualPF.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1280-6) contains supplementary material, which is available to authorized users.
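The quantities named in the abstract can be made concrete with a small brute-force sketch. It is not RNAdualPF and does not reproduce its algorithm or energy model. It assumes a hypothetical toy energy in which each base pair contributes a fixed value (the `PAIR_ENERGY` table is invented for illustration) and unpaired positions contribute nothing. All function names (`base_pairs`, `toy_energy`, `dual_partition`, `sample`) are likewise illustrative. The sketch enumerates every sequence compatible with a dot-bracket target, accumulates the dual partition function Z*, its GC-content decomposition Z*(k), and the dual expected energy ⟨E*⟩, and then samples sequences in proportion to their Boltzmann factors, optionally at a fixed GC-content.

```python
import itertools
import math
import random

RT = 0.0019872 * 310.15  # gas constant (kcal/(mol*K)) times 37 degrees C in kelvin

# Hypothetical toy pair energies in kcal/mol; NOT the nearest-neighbor model.
PAIR_ENERGY = {
    ("G", "C"): -3.0, ("C", "G"): -3.0,
    ("A", "U"): -2.0, ("U", "A"): -2.0,
    ("G", "U"): -1.0, ("U", "G"): -1.0,
}


def base_pairs(structure):
    """Return the list of (i, j) base pairs of a dot-bracket string."""
    stack, pairs = [], []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.append((stack.pop(), j))
    return pairs


def toy_energy(seq, pairs):
    """Toy energy E(a, s0), or None if seq cannot form every pair of s0."""
    total = 0.0
    for i, j in pairs:
        e = PAIR_ENERGY.get((seq[i], seq[j]))
        if e is None:
            return None  # sequence is incompatible with the target
        total += e
    return total


def dual_partition(structure):
    """Enumerate all compatible sequences (exponential; toy sizes only).

    Returns (Z*, {k: Z*(k)}, <E*>, ensemble), where ensemble is a list of
    (sequence, gc_count, boltzmann_weight) tuples.
    """
    pairs = base_pairs(structure)
    z_total, weighted_energy = 0.0, 0.0
    z_by_gc, ensemble = {}, []
    for letters in itertools.product("ACGU", repeat=len(structure)):
        e = toy_energy(letters, pairs)
        if e is None:
            continue
        w = math.exp(-e / RT)  # Boltzmann factor of this sequence
        gc = letters.count("G") + letters.count("C")
        z_total += w
        z_by_gc[gc] = z_by_gc.get(gc, 0.0) + w
        weighted_energy += e * w
        ensemble.append(("".join(letters), gc, w))
    return z_total, z_by_gc, weighted_energy / z_total, ensemble


def sample(ensemble, n, gc=None, rng=random):
    """Draw n sequences with probability proportional to their weight,
    restricted to GC count `gc` when it is given."""
    pool = [(s, w) for s, g, w in ensemble if gc is None or g == gc]
    if not pool:
        return []
    seqs, weights = zip(*pool)
    return rng.choices(seqs, weights=weights, k=n)


if __name__ == "__main__":
    z, z_k, e_mean, ens = dual_partition("((...))")
    print("Z* =", z, " <E*> =", e_mean)
    print(sample(ens, 3, gc=4, rng=random.Random(0)))
```

The enumeration is exponential in sequence length and serves only to illustrate the definitions; the abstract describes RNAdualPF as computing Z* rapidly, so the program evidently does not work this way. Restricting the sampling pool to one GC count corresponds to sampling from the ensemble whose normalizing constant is Z*(k).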
format Online
Article
Text
id pubmed-5069997
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5069997 2016-10-24 RNAdualPF: software to compute the dual partition function with sample applications in molecular evolution theory Garcia-Martin, Juan Antonio Bayegan, Amir H. Dotu, Ivan Clote, Peter BMC Bioinformatics Software BioMed Central 2016-10-19 /pmc/articles/PMC5069997/ /pubmed/27756204 http://dx.doi.org/10.1186/s12859-016-1280-6 Text en
© The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title RNAdualPF: software to compute the dual partition function with sample applications in molecular evolution theory
topic Software