Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction
As most RNA structures are elusive to structure determination, obtaining solvent accessible surface areas (ASAs) of nucleotides in an RNA structure is an important first step to characterize potential functional sites and core structural regions. Here, we developed RNAsnap, the first machine-learnin...
Main Authors: | Yang, Yuedong; Li, Xiaomei; Zhao, Huiying; Zhan, Jian; Wang, Jihua; Zhou, Yaoqi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press 2017 |
Subjects: | Bioinformatics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5159645/ https://www.ncbi.nlm.nih.gov/pubmed/27807179 http://dx.doi.org/10.1261/rna.057364.116 |
_version_ | 1782481804797673472 |
---|---|
author | Yang, Yuedong; Li, Xiaomei; Zhao, Huiying; Zhan, Jian; Wang, Jihua; Zhou, Yaoqi |
author_sort | Yang, Yuedong |
collection | PubMed |
description | As most RNA structures are elusive to structure determination, obtaining solvent accessible surface areas (ASAs) of nucleotides in an RNA structure is an important first step to characterize potential functional sites and core structural regions. Here, we developed RNAsnap, the first machine-learning method trained on protein-bound RNA structures for solvent accessibility prediction. Built on sequence profiles from multiple sequence alignment (RNAsnap-prof), the method provided robust prediction in fivefold cross-validation and an independent test (Pearson correlation coefficients, r, between predicted and actual ASA values are 0.66 and 0.63, respectively). Application of the method to 6178 mRNAs revealed its positive correlation to mRNA accessibility by dimethyl sulphate (DMS) experimentally measured in vivo (r = 0.37) but not in vitro (r = 0.07), despite the lack of training on mRNAs and the fact that DMS accessibility is only an approximation to solvent accessibility. We further found strong association across coding and noncoding regions between predicted solvent accessibility of the mutation site of a single nucleotide variant (SNV) and the frequency of that variant in the population for 2.2 million SNVs obtained in the 1000 Genomes Project. Moreover, mapping solvent accessibility of RNAs to the human genome indicated that introns, 5′ cap of 5′ and 3′ cap of 3′ untranslated regions, are more solvent accessible, consistent with their respective functional roles. These results support conformational selections as the mechanism for the formation of RNA–protein complexes and highlight the utility of genome-scale characterization of RNA tertiary structures by RNAsnap. The server and its stand-alone downloadable version are available at http://sparks-lab.org. |
format | Online Article Text |
id | pubmed-5159645 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5159645 2017-01-01 Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction. Yang, Yuedong; Li, Xiaomei; Zhao, Huiying; Zhan, Jian; Wang, Jihua; Zhou, Yaoqi. RNA. Bioinformatics. Cold Spring Harbor Laboratory Press 2017-01 /pmc/articles/PMC5159645/ /pubmed/27807179 http://dx.doi.org/10.1261/rna.057364.116 Text en © 2016 Yang et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society http://creativecommons.org/licenses/by/4.0/ This article, published in RNA, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/. |
title | Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5159645/ https://www.ncbi.nlm.nih.gov/pubmed/27807179 http://dx.doi.org/10.1261/rna.057364.116 |