Cargando…
Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts
An important question in evolutionary biology is whether (and in what ways) genotype–phenotype (GP) map biases can influence evolutionary trajectories. Untangling the relative roles of natural selection and biases (and other factors) in shaping phenotypes can be difficult. Because the RNA secondary...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10054693/ https://www.ncbi.nlm.nih.gov/pubmed/36983865 http://dx.doi.org/10.3390/life13030708 |
_version_ | 1785015732476575744 |
---|---|
author | Ghaddar, Fatme Dingle, Kamaludin |
author_facet | Ghaddar, Fatme Dingle, Kamaludin |
author_sort | Ghaddar, Fatme |
collection | PubMed |
description | An important question in evolutionary biology is whether (and in what ways) genotype–phenotype (GP) map biases can influence evolutionary trajectories. Untangling the relative roles of natural selection and biases (and other factors) in shaping phenotypes can be difficult. Because the RNA secondary structure (SS) can be analyzed in detail mathematically and computationally, is biologically relevant, and a wealth of bioinformatic data are available, it offers a good model system for studying the role of bias. For quite short RNA (length [Formula: see text]), it has recently been shown that natural and random RNA types are structurally very similar, suggesting that bias strongly constrains evolutionary dynamics. Here, we extend these results with emphasis on much larger RNA with lengths up to 3000 nucleotides. By examining both abstract shapes and structural motif frequencies (i.e., the number of helices, bonds, bulges, junctions, and loops), we find that large natural and random structures are also very similar, especially when contrasted to typical structures sampled from the spaces of all possible RNA structures. Our motif frequency study yields another result, where the frequencies of different motifs can be used in machine learning algorithms to classify random and natural RNA with high accuracy, especially for longer RNA (e.g., ROC AUC 0.86 for L = 1000). The most important motifs for classification are the number of bulges, loops, and bonds. This finding may be useful in using SS to detect candidates for functional RNA within ‘junk’ DNA regions. |
format | Online Article Text |
id | pubmed-10054693 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-100546932023-03-30 Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts Ghaddar, Fatme Dingle, Kamaludin Life (Basel) Article An important question in evolutionary biology is whether (and in what ways) genotype–phenotype (GP) map biases can influence evolutionary trajectories. Untangling the relative roles of natural selection and biases (and other factors) in shaping phenotypes can be difficult. Because the RNA secondary structure (SS) can be analyzed in detail mathematically and computationally, is biologically relevant, and a wealth of bioinformatic data are available, it offers a good model system for studying the role of bias. For quite short RNA (length [Formula: see text]), it has recently been shown that natural and random RNA types are structurally very similar, suggesting that bias strongly constrains evolutionary dynamics. Here, we extend these results with emphasis on much larger RNA with lengths up to 3000 nucleotides. By examining both abstract shapes and structural motif frequencies (i.e., the number of helices, bonds, bulges, junctions, and loops), we find that large natural and random structures are also very similar, especially when contrasted to typical structures sampled from the spaces of all possible RNA structures. Our motif frequency study yields another result, where the frequencies of different motifs can be used in machine learning algorithms to classify random and natural RNA with high accuracy, especially for longer RNA (e.g., ROC AUC 0.86 for L = 1000). The most important motifs for classification are the number of bulges, loops, and bonds. This finding may be useful in using SS to detect candidates for functional RNA within ‘junk’ DNA regions. MDPI 2023-03-06 /pmc/articles/PMC10054693/ /pubmed/36983865 http://dx.doi.org/10.3390/life13030708 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Ghaddar, Fatme Dingle, Kamaludin Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts |
title | Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts |
title_full | Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts |
title_fullStr | Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts |
title_full_unstemmed | Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts |
title_short | Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts |
title_sort | random and natural non-coding rna have similar structural motif patterns but differ in bulge, loop, and bond counts |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10054693/ https://www.ncbi.nlm.nih.gov/pubmed/36983865 http://dx.doi.org/10.3390/life13030708 |
work_keys_str_mv | AT ghaddarfatme randomandnaturalnoncodingrnahavesimilarstructuralmotifpatternsbutdifferinbulgeloopandbondcounts AT dinglekamaludin randomandnaturalnoncodingrnahavesimilarstructuralmotifpatternsbutdifferinbulgeloopandbondcounts |