IRBIS: a systematic search for conserved complementarity

IRBIS is a computational pipeline for detecting conserved complementary regions in unaligned orthologous sequences. Unlike other methods, it follows the “first-fold-then-align” principle in which all possible combinations of complementary k-mers are searched for simultaneous conservation. The novel...

Bibliographic Details
Main author: Pervouchine, Dmitri D.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2014
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4174434/
https://www.ncbi.nlm.nih.gov/pubmed/25142064
http://dx.doi.org/10.1261/rna.045088.114
_version_ 1782336347676082176
author Pervouchine, Dmitri D.
author_facet Pervouchine, Dmitri D.
author_sort Pervouchine, Dmitri D.
collection PubMed
description IRBIS is a computational pipeline for detecting conserved complementary regions in unaligned orthologous sequences. Unlike other methods, it follows the “first-fold-then-align” principle in which all possible combinations of complementary k-mers are searched for simultaneous conservation. The novel trimming procedure reduces the size of the search space and improves the performance to the point where large-scale analyses of intra- and intermolecular RNA–RNA interactions become possible. In this article, I provide a rigorous description of the method, benchmarking on simulated and real data, and a set of stringent predictions of intramolecular RNA structure in placental mammals, drosophilids, and nematodes. I discuss two particular cases of long-range RNA structures that are likely to have a causal effect on single- and multiple-exon skipping, one in the mammalian gene Dystonin and the other in the insect gene Ca-α(1)D. In Dystonin, one of the two complementary boxes contains a binding site of Rbfox protein similar to one recently described in Enah gene. I also report that snoRNAs and long noncoding RNAs (lncRNAs) have a high capacity of base-pairing to introns of protein-coding genes, suggesting possible involvement of these transcripts in splicing regulation. I also find that conserved sequences that occur equally likely on both strands of DNA (e.g., transcription factor binding sites) contribute strongly to the false-discovery rate and, therefore, would confound every such analysis. IRBIS is an open-source software that is available at http://genome.crg.es/~dmitri/irbis/.
format Online
Article
Text
id pubmed-4174434
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-41744342014-10-02 IRBIS: a systematic search for conserved complementarity Pervouchine, Dmitri D. RNA Bioinformatics
Cold Spring Harbor Laboratory Press 2014-10 /pmc/articles/PMC4174434/ /pubmed/25142064 http://dx.doi.org/10.1261/rna.045088.114 Text en © 2014 Pervouchine; Published by Cold Spring Harbor Laboratory Press for the RNA Society http://creativecommons.org/licenses/by/4.0/ This article, published in RNA, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
spellingShingle Bioinformatics
Pervouchine, Dmitri D.
IRBIS: a systematic search for conserved complementarity
title IRBIS: a systematic search for conserved complementarity
title_full IRBIS: a systematic search for conserved complementarity
title_fullStr IRBIS: a systematic search for conserved complementarity
title_full_unstemmed IRBIS: a systematic search for conserved complementarity
title_short IRBIS: a systematic search for conserved complementarity
title_sort irbis: a systematic search for conserved complementarity
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4174434/
https://www.ncbi.nlm.nih.gov/pubmed/25142064
http://dx.doi.org/10.1261/rna.045088.114
work_keys_str_mv AT pervouchinedmitrid irbisasystematicsearchforconservedcomplementarity
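
The description field above states that IRBIS searches combinations of complementary k-mers for simultaneous conservation in unaligned orthologous sequences. The following is a minimal sketch of that general idea only, not the IRBIS implementation: it omits the trimming procedure, considers Watson-Crick pairs only, and uses exact k-mer matching. All function names (`reverse_complement`, `kmer_positions`, `has_ordered_pair`, `conserved_complementary_kmers`) and the toy sequences are hypothetical and introduced purely for illustration.

```python
from typing import Dict, List, Set

# Watson-Crick complement table for RNA (assumption: no G-U wobble pairs).
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(kmer: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA k-mer."""
    return kmer.translate(_COMPLEMENT)[::-1]


def kmer_positions(seq: str, k: int) -> Dict[str, List[int]]:
    """Map every k-mer in seq to the ascending list of its start positions."""
    positions: Dict[str, List[int]] = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    return positions


def has_ordered_pair(positions: Dict[str, List[int]], kmer: str,
                     partner: str, k: int) -> bool:
    """True if some copy of kmer ends before some copy of partner starts."""
    if kmer not in positions or partner not in positions:
        return False
    # Compare the leftmost kmer with the rightmost partner.
    return positions[kmer][0] + k <= positions[partner][-1]


def conserved_complementary_kmers(orthologs: List[str], k: int) -> Set[str]:
    """K-mers that have a downstream reverse complement in every sequence."""
    tables = [kmer_positions(seq, k) for seq in orthologs]
    if not tables:
        return set()
    return {
        kmer
        for kmer in tables[0]
        if all(has_ordered_pair(t, kmer, reverse_complement(kmer), k)
               for t in tables)
    }


# Toy, made-up sequences: both carry GGCUC upstream of its partner GAGCC.
toy_orthologs = ["AAGGCUCAAAAGAGCCAA", "GGCUCUUUUUUUGAGCC"]
print(conserved_complementary_kmers(toy_orthologs, 5))  # {'GGCUC'}
```

Because the sequences are never aligned, the sketch compares k-mer content rather than alignment columns, which is the sense in which complementarity is examined before any alignment. A realistic search would also need to handle the false positives that the description attributes to conserved sequences occurring on both DNA strands.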