50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana

Intron retention, one of the most prevalent alternative splicing events in plants, can lead to introns being retained in mature mRNAs. However, the features that distinguish retained introns (RIs) from constitutively spliced introns (CSIs) are still poorly understood. This wor...

Bibliographic Details
Main Authors: Mao, Rui; Liang, Chun; Zhang, Yang; Hao, Xingan; Li, Jinyan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2017
Subjects: Plant Science
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5640774/
https://www.ncbi.nlm.nih.gov/pubmed/29062321
http://dx.doi.org/10.3389/fpls.2017.01728
author Mao, Rui
Liang, Chun
Zhang, Yang
Hao, Xingan
Li, Jinyan
author_sort Mao, Rui
collection PubMed
description Intron retention, one of the most prevalent alternative splicing events in plants, can lead to introns being retained in mature mRNAs. However, the features that distinguish retained introns (RIs) from constitutively spliced introns (CSIs) are still poorly understood. This work proposes a computational pipeline to discover novel RIs from multiple next-generation RNA sequencing (RNA-Seq) datasets of Arabidopsis thaliana. Using this pipeline, we detected 3,472 novel RIs from 18 RNA-Seq datasets and re-confirmed 1,384 RIs that are currently annotated in the TAIR10 database. We also use the expression of intron-containing isoforms as a new feature in addition to the conventional features. Based on these features, machine learning methods distinguish RIs from CSIs well, especially when the expressional odds of retention (i.e., the expression ratio of the RI-containing isoforms relative to the isoforms without RIs for the same gene) reach or exceed 50/50. In this case, a Random Forest separates RIs from CSIs with an AUC (area under the receiver operating characteristic curve) of 0.95. The characteristics most closely associated with RIs include low splice-site strength, high similarity to the flanking exon sequences, a low occurrence of YTRAY near the acceptor site, the presence of putative intronic splicing silencers (ISSs, i.e., AG/GA-rich motifs) and intronic splicing enhancers (ISEs, i.e., TTTT-containing motifs), and enrichment of Serine/Arginine-Rich (SR) proteins and heterogeneous nuclear ribonucleoproteins (hnRNPs).
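The expressional odds of retention defined in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the summing of expression over multiple isoforms, and the example values are assumptions made for illustration, and the record does not state which expression unit the paper uses.

```python
import math


def retention_odds(ri_expr, non_ri_expr):
    """Ratio of RI-containing isoform expression to expression of
    isoforms without the RI, for one gene.

    Assumption: expression values of several isoforms are summed.
    """
    ri_total = sum(ri_expr)
    non_ri_total = sum(non_ri_expr)
    if non_ri_total == 0:
        # No expression of spliced isoforms: odds are unbounded,
        # or undefined when nothing is expressed at all.
        return math.inf if ri_total > 0 else math.nan
    return ri_total / non_ri_total


def reaches_50_50(ri_expr, non_ri_expr):
    """True when the odds of retention reach or exceed 50/50 (ratio >= 1)."""
    odds = retention_odds(ri_expr, non_ri_expr)
    return not math.isnan(odds) and odds >= 1.0


# Hypothetical expression values for one gene (arbitrary units).
print(retention_odds([5.0, 5.0], [10.0]))
print(reaches_50_50([1.0], [9.0]))
```

Under this reading, 50/50 odds correspond to a ratio of 1: the RI-containing isoforms are expressed at least as strongly as the isoforms in which the intron is spliced out.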
format Online
Article
Text
id pubmed-5640774
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5640774 2017-10-23 Front Plant Sci Plant Science Frontiers Media S.A. 2017-10-09 /pmc/articles/PMC5640774/ /pubmed/29062321 http://dx.doi.org/10.3389/fpls.2017.01728 Text en Copyright © 2017 Mao, Liang, Zhang, Hao and Li. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title 50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5640774/
https://www.ncbi.nlm.nih.gov/pubmed/29062321
http://dx.doi.org/10.3389/fpls.2017.01728