Cargando…

Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs

RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs a...

Descripción completa

Detalles Bibliográficos
Autores principales: Soulé, Antoine, Reinharz, Vladimir, Sarrazin-Gendron, Roman, Denise, Alain, Waldispühl, Jérôme
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191989/
https://www.ncbi.nlm.nih.gov/pubmed/34048427
http://dx.doi.org/10.1371/journal.pcbi.1008990
_version_ 1783705963220434944
author Soulé, Antoine
Reinharz, Vladimir
Sarrazin-Gendron, Roman
Denise, Alain
Waldispühl, Jérôme
author_facet Soulé, Antoine
Reinharz, Vladimir
Sarrazin-Gendron, Roman
Denise, Alain
Waldispühl, Jérôme
author_sort Soulé, Antoine
collection PubMed
description RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, loops. Representing the RNA structure as a graph has recently allowed to expend this work to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNA in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared sub-structure spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia Coli, and Pseudomonas aeruginosa.
format Online
Article
Text
id pubmed-8191989
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-81919892021-06-10 Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs Soulé, Antoine Reinharz, Vladimir Sarrazin-Gendron, Roman Denise, Alain Waldispühl, Jérôme PLoS Comput Biol Research Article RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, loops. Representing the RNA structure as a graph has recently allowed to expend this work to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNA in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared sub-structure spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia Coli, and Pseudomonas aeruginosa. Public Library of Science 2021-05-28 /pmc/articles/PMC8191989/ /pubmed/34048427 http://dx.doi.org/10.1371/journal.pcbi.1008990 Text en © 2021 Soulé et al https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Soulé, Antoine
Reinharz, Vladimir
Sarrazin-Gendron, Roman
Denise, Alain
Waldispühl, Jérôme
Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs
title Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs
title_full Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs
title_fullStr Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs
title_full_unstemmed Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs
title_short Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs
title_sort finding recurrent rna structural networks with fast maximal common subgraphs of edge-colored graphs
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191989/
https://www.ncbi.nlm.nih.gov/pubmed/34048427
http://dx.doi.org/10.1371/journal.pcbi.1008990
work_keys_str_mv AT souleantoine findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs
AT reinharzvladimir findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs
AT sarrazingendronroman findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs
AT denisealain findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs
AT waldispuhljerome findingrecurrentrnastructuralnetworkswithfastmaximalcommonsubgraphsofedgecoloredgraphs