Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction

Bibliographic Details
Main authors: Yang, Yuedong, Li, Xiaomei, Zhao, Huiying, Zhan, Jian, Wang, Jihua, Zhou, Yaoqi
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2017
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5159645/
https://www.ncbi.nlm.nih.gov/pubmed/27807179
http://dx.doi.org/10.1261/rna.057364.116
author Yang, Yuedong
Li, Xiaomei
Zhao, Huiying
Zhan, Jian
Wang, Jihua
Zhou, Yaoqi
collection PubMed
description As most RNA structures are elusive to structure determination, obtaining solvent accessible surface areas (ASAs) of nucleotides in an RNA structure is an important first step to characterize potential functional sites and core structural regions. Here, we developed RNAsnap, the first machine-learning method trained on protein-bound RNA structures for solvent accessibility prediction. Built on sequence profiles from multiple sequence alignment (RNAsnap-prof), the method provided robust prediction in fivefold cross-validation and an independent test (Pearson correlation coefficients, r, between predicted and actual ASA values are 0.66 and 0.63, respectively). Application of the method to 6178 mRNAs revealed its positive correlation to mRNA accessibility by dimethyl sulphate (DMS) experimentally measured in vivo (r = 0.37) but not in vitro (r = 0.07), despite the lack of training on mRNAs and the fact that DMS accessibility is only an approximation to solvent accessibility. We further found strong association across coding and noncoding regions between predicted solvent accessibility of the mutation site of a single nucleotide variant (SNV) and the frequency of that variant in the population for 2.2 million SNVs obtained in the 1000 Genomes Project. Moreover, mapping solvent accessibility of RNAs to the human genome indicated that introns, 5′ cap of 5′ and 3′ cap of 3′ untranslated regions, are more solvent accessible, consistent with their respective functional roles. These results support conformational selections as the mechanism for the formation of RNA–protein complexes and highlight the utility of genome-scale characterization of RNA tertiary structures by RNAsnap. The server and its stand-alone downloadable version are available at http://sparks-lab.org.
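The description above reports agreement between predicted and actual ASA values as a Pearson correlation coefficient, r. As a minimal sketch of how such a coefficient is computed (this is not the RNAsnap code, and the function name `pearson_r` and the ASA values below are hypothetical, chosen only for illustration):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-nucleotide ASA values, NOT data from the article.
predicted_asa = [120.0, 85.5, 40.2, 150.3, 95.0]
actual_asa = [110.0, 90.0, 55.0, 160.0, 70.0]

r = pearson_r(predicted_asa, actual_asa)
print(f"r = {r:.2f}")
```

A value of r near 1 would mean predicted and actual ASA rise and fall together; the reported values of 0.66 and 0.63 indicate a moderate positive linear relationship.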
format Online
Article
Text
id pubmed-5159645
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-5159645 2017-01-01 Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction Yang, Yuedong Li, Xiaomei Zhao, Huiying Zhan, Jian Wang, Jihua Zhou, Yaoqi RNA Bioinformatics As most RNA structures are elusive to structure determination, obtaining solvent accessible surface areas (ASAs) of nucleotides in an RNA structure is an important first step to characterize potential functional sites and core structural regions. Here, we developed RNAsnap, the first machine-learning method trained on protein-bound RNA structures for solvent accessibility prediction. Built on sequence profiles from multiple sequence alignment (RNAsnap-prof), the method provided robust prediction in fivefold cross-validation and an independent test (Pearson correlation coefficients, r, between predicted and actual ASA values are 0.66 and 0.63, respectively). Application of the method to 6178 mRNAs revealed its positive correlation to mRNA accessibility by dimethyl sulphate (DMS) experimentally measured in vivo (r = 0.37) but not in vitro (r = 0.07), despite the lack of training on mRNAs and the fact that DMS accessibility is only an approximation to solvent accessibility. We further found strong association across coding and noncoding regions between predicted solvent accessibility of the mutation site of a single nucleotide variant (SNV) and the frequency of that variant in the population for 2.2 million SNVs obtained in the 1000 Genomes Project. Moreover, mapping solvent accessibility of RNAs to the human genome indicated that introns, 5′ cap of 5′ and 3′ cap of 3′ untranslated regions, are more solvent accessible, consistent with their respective functional roles. These results support conformational selections as the mechanism for the formation of RNA–protein complexes and highlight the utility of genome-scale characterization of RNA tertiary structures by RNAsnap. The server and its stand-alone downloadable version are available at http://sparks-lab.org. Cold Spring Harbor Laboratory Press 2017-01 /pmc/articles/PMC5159645/ /pubmed/27807179 http://dx.doi.org/10.1261/rna.057364.116 Text en © 2016 Yang et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society http://creativecommons.org/licenses/by/4.0/ This article, published in RNA, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5159645/
https://www.ncbi.nlm.nih.gov/pubmed/27807179
http://dx.doi.org/10.1261/rna.057364.116