Positional motif analysis reveals the extent of specificity of protein-RNA interactions observed by CLIP

Bibliographic Details
Main authors: Kuret, Klara; Amalietti, Aram Gustav; Jones, D. Marc; Capitanchik, Charlotte; Ule, Jernej
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9461102/
https://www.ncbi.nlm.nih.gov/pubmed/36085079
http://dx.doi.org/10.1186/s13059-022-02755-2
Description
Summary: BACKGROUND: Crosslinking and immunoprecipitation (CLIP) is a method used to identify in vivo RNA–protein binding sites on a transcriptome-wide scale. With the increasing amounts of available data for RNA-binding proteins (RBPs), it is important to understand to what degree the enriched motifs specify the RNA-binding profiles of RBPs in cells. RESULTS: We develop positionally enriched k-mer analysis (PEKA), a computational tool for efficient analysis of enriched motifs from individual CLIP datasets, which minimizes the impact of technical and regional genomic biases by internal data normalization. We cross-validate PEKA with mCross and show that the use of input control for background correction is not required to yield high specificity of enriched motifs. We identify motif classes with common enrichment patterns across eCLIP datasets and across RNA regions, while also observing variations in the specificity and the extent of motif enrichment across eCLIP datasets, between variant CLIP protocols, and between CLIP and in vitro binding data. Thereby, we gain insights into the contributions of technical and regional genomic biases to the enriched motifs, and find how motif enrichment features relate to the domain composition and low-complexity regions of the studied proteins. CONCLUSIONS: Our study provides insights into the overall contributions of regional binding preferences, protein domains, and low-complexity regions to the specificity of protein-RNA interactions, and shows the value of cross-motif and cross-RBP comparison for data interpretation. Our results are presented for exploratory analysis via an online platform in an RBP-centric and motif-centric manner (https://imaps.goodwright.com/apps/peka/). SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-022-02755-2.