
PRIESSTESS: interpretable, high-performing models of the sequence and structure preferences of RNA-binding proteins

Modelling both primary sequence and secondary structure preferences for RNA binding proteins (RBPs) remains an ongoing challenge. Current models use varied RNA structure representations and can be difficult to interpret and evaluate. To address these issues, we present a universal RNA motif-finding/...


Bibliographic Details
Main authors: Laverty, Kaitlin U; Jolma, Arttu; Pour, Sara E; Zheng, Hong; Ray, Debashish; Morris, Quaid; Hughes, Timothy R
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9638913/
https://www.ncbi.nlm.nih.gov/pubmed/36018788
http://dx.doi.org/10.1093/nar/gkac694
author Laverty, Kaitlin U
Jolma, Arttu
Pour, Sara E
Zheng, Hong
Ray, Debashish
Morris, Quaid
Hughes, Timothy R
collection PubMed
description Modelling both primary sequence and secondary structure preferences for RNA binding proteins (RBPs) remains an ongoing challenge. Current models use varied RNA structure representations and can be difficult to interpret and evaluate. To address these issues, we present a universal RNA motif-finding/scanning strategy, termed PRIESSTESS (Predictive RBP-RNA InterpretablE Sequence-Structure moTif regrESSion), that can be applied to diverse RNA binding datasets. PRIESSTESS identifies dozens of enriched RNA sequence and/or structure motifs that are subsequently reduced to a set of core motifs by logistic regression with LASSO regularization. Importantly, these core motifs are easily visualized and interpreted, and provide a measure of RBP secondary structure specificity. We used PRIESSTESS to interrogate new HTR-SELEX data for 23 RBPs with diverse RNA binding modes and captured known primary sequence and secondary structure preferences for each. Moreover, when applying PRIESSTESS to 144 RBPs across 202 RNA binding datasets, 75% showed an RNA secondary structure preference but only 10% had a preference besides unpaired bases, suggesting that most RBPs simply recognize the accessibility of primary sequences.
format Online
Article
Text
id pubmed-9638913
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9638913 2022-11-07 Nucleic Acids Res Methods Online Oxford University Press 2022-08-26 /pmc/articles/PMC9638913/ /pubmed/36018788 http://dx.doi.org/10.1093/nar/gkac694 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title PRIESSTESS: interpretable, high-performing models of the sequence and structure preferences of RNA-binding proteins
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9638913/
https://www.ncbi.nlm.nih.gov/pubmed/36018788
http://dx.doi.org/10.1093/nar/gkac694
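
The description field above states that enriched motifs are "reduced to a set of core motifs by logistic regression with LASSO regularization". The sketch below illustrates that general idea only; it is not the PRIESSTESS implementation. Everything in it is an assumption made for illustration: the synthetic motif-score matrix, the choice of three informative motifs, the penalty strength `lam`, and the solver (plain proximal gradient descent written with NumPy).

```python
import numpy as np


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=500):
    """Fit L1-penalised (LASSO) logistic regression by proximal gradient descent.

    X: (n_sequences, n_motifs) matrix of hypothetical motif scores.
    y: 0/1 labels (1 = bound, 0 = unbound).
    Returns the weight vector and the (unpenalised) intercept.
    """
    n, m = X.shape
    w = np.zeros(m)
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted binding probability
        w = w - lr * (X.T @ (p - y) / n)        # gradient step on the log-loss
        b = b - lr * np.mean(p - y)
        # Soft-thresholding: the proximal step for the L1 penalty.
        # This is what sets the weights of uninformative motifs to exactly zero.
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w, b


# Synthetic stand-in data (an assumption, not real RBP binding data):
# 20 candidate motifs, of which only motifs 0, 3 and 7 influence binding.
rng = np.random.default_rng(0)
n_seqs, n_motifs = 1000, 20
X = rng.normal(size=(n_seqs, n_motifs))
true_w = np.zeros(n_motifs)
true_w[[0, 3, 7]] = [2.0, -2.0, 2.0]
y = (rng.random(n_seqs) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)

w, b = fit_l1_logistic(X, y)
core = np.flatnonzero(w)  # indices of motifs that keep a non-zero weight
print("core motif indices:", core.tolist())
print("their weights:", np.round(w[core], 3).tolist())
```

Motifs whose weights survive the penalty play the role of the "core motifs" in the description; raising `lam` shrinks that set, lowering it admits more motifs.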