Cargando…
Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs
RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs a...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2021
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191989/ https://www.ncbi.nlm.nih.gov/pubmed/34048427 http://dx.doi.org/10.1371/journal.pcbi.1008990 |
_version_ | 1783705963220434944 |
---|---|
author | Soulé, Antoine Reinharz, Vladimir Sarrazin-Gendron, Roman Denise, Alain Waldispühl, Jérôme |
author_facet | Soulé, Antoine Reinharz, Vladimir Sarrazin-Gendron, Roman Denise, Alain Waldispühl, Jérôme |
author_sort | Soulé, Antoine |
collection | PubMed |
description | RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, loops. Representing the RNA structure as a graph has recently allowed to expend this work to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNA in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared sub-structure spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia Coli, and Pseudomonas aeruginosa. |
format | Online Article Text |
id | pubmed-8191989 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-81919892021-06-10 Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs Soulé, Antoine Reinharz, Vladimir Sarrazin-Gendron, Roman Denise, Alain Waldispühl, Jérôme PLoS Comput Biol Research Article RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, loops. Representing the RNA structure as a graph has recently allowed to expend this work to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNA in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared sub-structure spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia Coli, and Pseudomonas aeruginosa. Public Library of Science 2021-05-28 /pmc/articles/PMC8191989/ /pubmed/34048427 http://dx.doi.org/10.1371/journal.pcbi.1008990 Text en © 2021 Soulé et al https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Soulé, Antoine Reinharz, Vladimir Sarrazin-Gendron, Roman Denise, Alain Waldispühl, Jérôme Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs |
title | Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs |
title_full | Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs |
title_fullStr | Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs |
title_full_unstemmed | Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs |
title_short | Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs |
title_sort | finding recurrent rna structural networks with fast maximal common subgraphs of edge-colored graphs |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191989/ https://www.ncbi.nlm.nih.gov/pubmed/34048427 http://dx.doi.org/10.1371/journal.pcbi.1008990 |
work_keys_str_mv | AT souleantoine findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs AT reinharzvladimir findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs AT sarrazingendronroman findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs AT denisealain findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs AT waldispuhljerome findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs |