
Abundance of correctly folded RNA motifs in sequence space, calculated on computational grids

Although functional RNA molecules are known to be biased in overall composition, the effects of background composition on the probability of finding a particular active site by chance have received little attention. The probability of finding a particular motif has important implications both for und...
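
The full description in this record reports the number of random sequences needed for a 50% chance of finding each motif (4.1 × 10^9 for the isoleucine aptamer and 1.6 × 10^10 for the hammerhead ribozyme, in unbiased pools of length 100). As a rough illustration only, and not the authors' method, the sketch below converts between such a pool size and a per-sequence probability. It assumes every sequence in the pool is an independent draw with the same probability of containing a correctly folded motif; the function names are hypothetical.

```python
import math


def pool_size_for_probability(p_per_sequence: float, target: float = 0.5) -> float:
    """Pool size N at which P(at least one hit) equals `target`.

    Assumption: sequences are independent, each a hit with probability p, so
    P(at least one hit) = 1 - (1 - p)**N  and  N = ln(1 - target) / ln(1 - p).
    """
    return math.log1p(-target) / math.log1p(-p_per_sequence)


def per_sequence_probability(pool_size: float, target: float = 0.5) -> float:
    """Inverse relation: p = 1 - (1 - target)**(1 / N)."""
    # expm1/log1p keep precision when p is tiny (on the order of 1e-10 here).
    return -math.expm1(math.log1p(-target) / pool_size)


if __name__ == "__main__":
    # Pool sizes quoted in the record's description (50% probability, length 100).
    reported = {"isoleucine aptamer": 4.1e9, "hammerhead ribozyme": 1.6e10}
    for motif, n in reported.items():
        p = per_sequence_probability(n)
        print(f"{motif}: implied per-sequence probability ~ {p:.2e}")
```

Under this assumption the per-sequence probability is approximately ln(2)/N, so a pool roughly four times larger corresponds to a motif roughly four times rarer.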


Bibliographic Details
Main authors: Knight, Rob, De Sterck, Hans, Markel, Rob, Smit, Sandra, Oshmyansky, Alexander, Yarus, Michael
Format: Text
Language: English
Published: Oxford University Press 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1258168/
https://www.ncbi.nlm.nih.gov/pubmed/16237127
http://dx.doi.org/10.1093/nar/gki886
author Knight, Rob
De Sterck, Hans
Markel, Rob
Smit, Sandra
Oshmyansky, Alexander
Yarus, Michael
author_sort Knight, Rob
collection PubMed
description Although functional RNA molecules are known to be biased in overall composition, the effects of background composition on the probability of finding a particular active site by chance have received little attention. The probability of finding a particular motif has important implications both for understanding the distribution of functional RNAs in ancient and modern organisms with varying genome compositions and for tuning SELEX pools to optimize the chance of finding specific functions. Here we develop a new method for calculating the probability of finding a modular motif containing base-paired regions, and use a computational grid to fold several hundred million random RNA sequences containing the core elements of the isoleucine aptamer and the hammerhead ribozyme to estimate the probability that a sequence containing these structural elements will fold correctly when isolated from background sequences of different compositions. We find that the two motifs are most likely to be found in distinct regions of compositional space, and that the regions of greatest abundance are influenced by the probability of finding the conserved bases, finding the flanking helices, and folding, in that order of importance. Additionally, we can refine our estimates of the number of random sequences required for a 50% probability of finding an example of each site in unbiased random pools of length 100 to 4.1 × 10^9 for the isoleucine aptamer and 1.6 × 10^10 for the hammerhead ribozyme. These figures are consistent with the facile recovery of these motifs from SELEX experiments.
format Text
id pubmed-1258168
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-1258168 2005-10-28 Nucleic Acids Res Article Oxford University Press 2005 2005-10-19 /pmc/articles/PMC1258168/ /pubmed/16237127 http://dx.doi.org/10.1093/nar/gki886 Text en © The Author 2005. Published by Oxford University Press. All rights reserved
title Abundance of correctly folded RNA motifs in sequence space, calculated on computational grids
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1258168/
https://www.ncbi.nlm.nih.gov/pubmed/16237127
http://dx.doi.org/10.1093/nar/gki886