
Identifying novel sequence variants of RNA 3D motifs


Bibliographic Details
Main Authors: Zirbel, Craig L., Roll, James, Sweeney, Blake A., Petrov, Anton I., Pirrung, Meg, Leontis, Neocles B.
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
RNA
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4551918/
https://www.ncbi.nlm.nih.gov/pubmed/26130723
http://dx.doi.org/10.1093/nar/gkv651
collection PubMed
description Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download.
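The threshold-setting step mentioned in the description (scoring randomly generated sequences to pick an acceptance cutoff that bounds the false positive rate) can be illustrated with a minimal sketch. This is not JAR3D's code or its SCFG/MRF scoring: the position-independent profile PROFILE, the functions toy_score, random_sequences and acceptance_threshold, and the values alpha = 0.05 and 2000 random sequences are all assumptions made up for illustration.

```python
import math
import random

ALPHABET = "ACGU"

# Hypothetical 4-position profile; a stand-in for a real motif model.
PROFILE = [
    {"A": 0.10, "C": 0.10, "G": 0.70, "U": 0.10},
    {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25},
    {"A": 0.60, "C": 0.05, "G": 0.30, "U": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "U": 0.05},
]


def toy_score(seq, profile):
    """Log-probability of seq under an independent-position profile."""
    return sum(math.log(profile[i][base]) for i, base in enumerate(seq))


def random_sequences(n, length, rng):
    """Draw n uniformly random RNA sequences of the given length."""
    return ["".join(rng.choice(ALPHABET) for _ in range(length)) for _ in range(n)]


def acceptance_threshold(null_scores, alpha):
    """Return a cutoff such that at most a fraction alpha of the
    null (random-sequence) scores lie strictly above it."""
    ranked = sorted(null_scores)
    k = math.ceil((1 - alpha) * len(ranked)) - 1
    return ranked[max(k, 0)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    null_scores = [toy_score(s, PROFILE) for s in random_sequences(2000, 4, rng)]
    cutoff = acceptance_threshold(null_scores, alpha=0.05)
    fpr = sum(s > cutoff for s in null_scores) / len(null_scores)
    print("cutoff:", cutoff, "empirical false positive rate:", fpr)
```

A sequence is accepted only when its score lies strictly above the cutoff, so by construction no more than a fraction alpha of the random sequences pass. A real model would replace toy_score with its own scoring function; the quantile step stays the same.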
id pubmed-4551918
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-4551918 2015-08-28 Nucleic Acids Res Oxford University Press 2015-09-03 2015-06-29 Text en
© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com