Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15
BACKGROUND: Proteins recognize many different aspects of RNA, ranging from single-stranded regions to discrete secondary or tertiary structures. High-throughput sequencing (HTS) of in vitro selected populations offers a large-scale method to study RNA-protein interactions. However, most existing ana...
Main Authors: | Pei, Shermin; Slinger, Betty L.; Meyer, Michelle M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5461778/ https://www.ncbi.nlm.nih.gov/pubmed/28587636 http://dx.doi.org/10.1186/s12859-017-1704-y |
_version_ | 1783242408207581184 |
---|---|
author | Pei, Shermin; Slinger, Betty L.; Meyer, Michelle M. |
author_facet | Pei, Shermin; Slinger, Betty L.; Meyer, Michelle M. |
author_sort | Pei, Shermin |
collection | PubMed |
description | BACKGROUND: Proteins recognize many different aspects of RNA, ranging from single-stranded regions to discrete secondary or tertiary structures. High-throughput sequencing (HTS) of in vitro selected populations offers a large-scale method to study RNA-protein interactions. However, most existing analysis methods require that the binding motifs are enriched in the population relative to earlier rounds, and that motifs are found in a loop or single-stranded region of the potential RNA secondary structure. Such methods do not generalize to all RNA-protein interactions, as some RNA-binding proteins specifically recognize more complex structures such as double-stranded RNA. RESULTS: In this study, we use HT-SELEX derived populations to study the landscape of RNAs that interact with Geobacillus kaustophilus ribosomal protein S15. Our data show high sequence and structure diversity and proved intractable to existing methods. Conventional programs identified some sequence motifs, but these are found in less than 5-10% of the total sequence pool. Therefore, we developed a novel framework to analyze HT-SELEX data. Our process accounts for both sequence and structure components by abstracting the overall secondary structure into smaller substructures composed of a single base-pair stack, which allows us to leverage existing approaches already used in k-mer analysis to identify enriched motifs. By focusing on secondary structure motifs composed of specific two base-pair stacks, we identified significantly enriched or depleted structure motifs relative to earlier rounds. CONCLUSIONS: Discrete substructures are likely to be important to RNA-protein interactions, but they are difficult to elucidate. Substructures can help make highly diverse sequence data more tractable. The structure motifs provide limited accuracy in predicting enrichment, suggesting either that G. kaustophilus S15 can recognize many different secondary structure motifs or that some aspects of the interaction are not captured by the analysis. This highlights the importance of considering secondary and tertiary structure elements and their role in RNA-protein interactions. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1704-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5461778 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5461778 2017-06-08 Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15 Pei, Shermin; Slinger, Betty L.; Meyer, Michelle M. BMC Bioinformatics Research Article BioMed Central 2017-06-06 /pmc/articles/PMC5461778/ /pubmed/28587636 http://dx.doi.org/10.1186/s12859-017-1704-y Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Pei, Shermin Slinger, Betty L. Meyer, Michelle M. Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15 |
title | Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15 |
title_full | Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15 |
title_fullStr | Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15 |
title_full_unstemmed | Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15 |
title_short | Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15 |
title_sort | recognizing rna structural motifs in ht-selex data for ribosomal protein s15 |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5461778/ https://www.ncbi.nlm.nih.gov/pubmed/28587636 http://dx.doi.org/10.1186/s12859-017-1704-y |
work_keys_str_mv | AT peishermin recognizingrnastructuralmotifsinhtselexdataforribosomalproteins15 AT slingerbettyl recognizingrnastructuralmotifsinhtselexdataforribosomalproteins15 AT meyermichellem recognizingrnastructuralmotifsinhtselexdataforribosomalproteins15 |
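
The abstract in this record describes breaking each predicted secondary structure into two base-pair stacks and then counting them much like k-mers to find motifs enriched or depleted relative to earlier selection rounds. The sketch below illustrates that general idea only; it is not the authors' pipeline. It assumes structures are already available in dot-bracket notation (for example from a folding program), and the function names, the motif label format, and the pseudocount are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import log2


def base_pairs(dot_bracket):
    """Map each opening position to its closing partner (no pseudoknots)."""
    stack, pairs = [], {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs[stack.pop()] = i
    return pairs


def stacked_pairs(seq, dot_bracket):
    """Return labels for every two base-pair stack in one structure.

    A stack is pair (i, j) directly followed by pair (i+1, j-1).
    The label format "5' dinucleotide / 3' dinucleotide" is an assumption.
    """
    pairs = base_pairs(dot_bracket)
    return [
        f"{seq[i]}{seq[i + 1]}/{seq[j - 1]}{seq[j]}"
        for i, j in pairs.items()
        if pairs.get(i + 1) == j - 1
    ]


def motif_counts(population):
    """Count, per motif, how many sequences contain it at least once."""
    counts = Counter()
    for seq, structure in population:
        counts.update(set(stacked_pairs(seq, structure)))
    return counts


def log2_enrichment(early, late, pseudocount=1.0):
    """Log2 ratio of motif frequency in a late round versus an early round.

    The pseudocount (an assumed value) avoids division by zero for motifs
    absent from one round.
    """
    early_counts, late_counts = motif_counts(early), motif_counts(late)
    scores = {}
    for motif in set(early_counts) | set(late_counts):
        late_freq = (late_counts[motif] + pseudocount) / (len(late) + pseudocount)
        early_freq = (early_counts[motif] + pseudocount) / (len(early) + pseudocount)
        scores[motif] = log2(late_freq / early_freq)
    return scores
```

Each population is a list of `(sequence, dot_bracket)` tuples, one list per selection round. Positive scores mark stacks that are more frequent in the later round and negative scores mark depleted ones; a real analysis would also need a significance test, which this sketch omits.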