
Predicting sequence and structural specificities of RNA binding regions recognized by splicing factor SRSF1

BACKGROUND: RNA-binding proteins (RBPs) play diverse roles in eukaryotic RNA processing. Despite their pervasive functions in coding and noncoding RNA biogenesis and regulation, elucidating the sequence specificities that define protein-RNA interactions remains a major challenge. Recently, CLIP-seq...


Bibliographic Details
Main authors: Wang, Xin; Juan, Liran; Lv, Junjie; Wang, Kejun; Sanford, Jeremy R; Liu, Yunlong
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287504/
https://www.ncbi.nlm.nih.gov/pubmed/22369183
http://dx.doi.org/10.1186/1471-2164-12-S5-S8
_version_ 1782224678292553728
author Wang, Xin
Juan, Liran
Lv, Junjie
Wang, Kejun
Sanford, Jeremy R
Liu, Yunlong
author_sort Wang, Xin
collection PubMed
description BACKGROUND: RNA-binding proteins (RBPs) play diverse roles in eukaryotic RNA processing. Despite their pervasive functions in coding and noncoding RNA biogenesis and regulation, elucidating the sequence specificities that define protein-RNA interactions remains a major challenge. Recently, CLIP-seq (cross-linking immunoprecipitation followed by high-throughput sequencing) has been successfully implemented to study the transcriptome-wide binding patterns of the SRSF1, PTBP1, NOVA and fox2 proteins. These studies either adopted traditional methods such as Multiple EM for Motif Elicitation (MEME) to discover the sequence consensus of RBP binding sites or used Z-score statistics to search for overrepresented nucleotide words of a given size. We argue that most of these methods are not well suited for RNA motif identification, as they are unable to incorporate the RNA structural context of protein-RNA interactions, which may affect binding specificity. Here, we describe a novel model-based approach, RNAMotifModeler, to identify the consensus of protein-RNA binding regions by integrating sequence features and RNA secondary structures. RESULTS: As an example, we applied RNAMotifModeler to SRSF1 (SF2/ASF) CLIP-seq data. The sequence-structural consensus we identified is a purine-rich octamer 'AGAAGAAG' in a highly single-stranded RNA context. The unpaired probabilities (the probabilities of not forming base pairs) are significantly higher than those of negative controls and of the flanking sequences surrounding the binding site, indicating that SRSF1 tends to bind single-stranded RNA. Further statistical evaluations revealed that the second and fifth bases of the SRSF1 octamer motif have much stronger sequence specificities but weaker single-strandedness, while the third, fourth, sixth and seventh bases are far more likely to be single-stranded but have more degenerate sequence specificities. 
Therefore, we hypothesize that nucleotide specificity and secondary structure play complementary roles during binding site recognition by SRSF1. CONCLUSION: In this study, we presented a computational model to predict the sequence consensus and optimal RNA secondary structure for protein-RNA binding regions. The successful application to SRSF1 CLIP-seq data demonstrates great potential to improve our understanding of the binding specificity of RNA-binding proteins.
format Online
Article
Text
id pubmed-3287504
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3287504 2012-03-01 Predicting sequence and structural specificities of RNA binding regions recognized by splicing factor SRSF1 Wang, Xin Juan, Liran Lv, Junjie Wang, Kejun Sanford, Jeremy R Liu, Yunlong BMC Genomics Research Article BioMed Central 2011-12-23 /pmc/articles/PMC3287504/ /pubmed/22369183 http://dx.doi.org/10.1186/1471-2164-12-S5-S8 Text en Copyright ©2011 Wang et al. licensee BioMed Central Ltd http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predicting sequence and structural specificities of RNA binding regions recognized by splicing factor SRSF1
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287504/
https://www.ncbi.nlm.nih.gov/pubmed/22369183
http://dx.doi.org/10.1186/1471-2164-12-S5-S8