50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana
Intron retention, one of the most prevalent alternative splicing events in plants, can lead to introns retained in mature mRNAs. However, the features that distinguish retained introns (RIs) from constitutively spliced introns (CSIs) are still poorly understood. This wor...
Main authors: | Mao, Rui; Liang, Chun; Zhang, Yang; Hao, Xingan; Li, Jinyan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2017 |
Subjects: | Plant Science |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5640774/ https://www.ncbi.nlm.nih.gov/pubmed/29062321 http://dx.doi.org/10.3389/fpls.2017.01728 |
_version_ | 1783271098345848832 |
---|---|
author | Mao, Rui Liang, Chun Zhang, Yang Hao, Xingan Li, Jinyan |
author_facet | Mao, Rui Liang, Chun Zhang, Yang Hao, Xingan Li, Jinyan |
author_sort | Mao, Rui |
collection | PubMed |
description | Intron retention, one of the most prevalent alternative splicing events in plants, can lead to introns retained in mature mRNAs. However, the features that distinguish retained introns (RIs) from constitutively spliced introns (CSIs) are still poorly understood. This work proposes a computational pipeline to discover novel RIs from multiple next-generation RNA sequencing (RNA-Seq) datasets of Arabidopsis thaliana. Using this pipeline, we detected 3,472 novel RIs from 18 RNA-Seq datasets and re-confirmed 1,384 RIs that are currently annotated in the TAIR10 database. We also use the expression of intron-containing isoforms as a new feature in addition to the conventional features. Based on these features, RIs are highly distinguishable from CSIs by machine learning methods, especially when the expressional odds of retention (i.e., the expression ratio of the RI-containing isoforms relative to the isoforms without RIs for the same gene) reaches or exceeds 50/50. In this case, the RIs and CSIs can be clearly separated by a Random Forest classifier with an AUC (area under the receiver operating characteristic curve) of 0.95. The characteristics closely related to RIs include low splice-site strength, high similarity with the flanking exon sequences, a low occurrence percentage of YTRAY near the acceptor site, the existence of putative intronic splicing silencers (ISSs, i.e., AG/GA-rich motifs) and intronic splicing enhancers (ISEs, i.e., TTTT-containing motifs), and enrichment of Serine/Arginine-Rich (SR) proteins and heterogeneous nuclear ribonucleoproteins (hnRNPs). |
format | Online Article Text |
id | pubmed-5640774 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5640774 2017-10-23 50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana Mao, Rui Liang, Chun Zhang, Yang Hao, Xingan Li, Jinyan Front Plant Sci Plant Science Frontiers Media S.A. 2017-10-09 /pmc/articles/PMC5640774/ /pubmed/29062321 http://dx.doi.org/10.3389/fpls.2017.01728 Text en Copyright © 2017 Mao, Liang, Zhang, Hao and Li. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Plant Science Mao, Rui Liang, Chun Zhang, Yang Hao, Xingan Li, Jinyan 50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana |
title | 50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana |
title_full | 50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana |
title_fullStr | 50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana |
title_full_unstemmed | 50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana |
title_short | 50/50 Expressional Odds of Retention Signifies the Distinction between Retained Introns and Constitutively Spliced Introns in Arabidopsis thaliana |
title_sort | 50/50 expressional odds of retention signifies the distinction between retained introns and constitutively spliced introns in arabidopsis thaliana |
topic | Plant Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5640774/ https://www.ncbi.nlm.nih.gov/pubmed/29062321 http://dx.doi.org/10.3389/fpls.2017.01728 |
work_keys_str_mv | AT maorui 5050expressionaloddsofretentionsignifiesthedistinctionbetweenretainedintronsandconstitutivelysplicedintronsinarabidopsisthaliana AT liangchun 5050expressionaloddsofretentionsignifiesthedistinctionbetweenretainedintronsandconstitutivelysplicedintronsinarabidopsisthaliana AT zhangyang 5050expressionaloddsofretentionsignifiesthedistinctionbetweenretainedintronsandconstitutivelysplicedintronsinarabidopsisthaliana AT haoxingan 5050expressionaloddsofretentionsignifiesthedistinctionbetweenretainedintronsandconstitutivelysplicedintronsinarabidopsisthaliana AT lijinyan 5050expressionaloddsofretentionsignifiesthedistinctionbetweenretainedintronsandconstitutivelysplicedintronsinarabidopsisthaliana |
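
The description defines the expressional odds of retention as the expression ratio of the RI-containing isoforms to the isoforms without the RI for the same gene, with 50/50 as the point where RIs become clearly separable from CSIs. The following is a minimal sketch of that ratio and threshold, not the authors' pipeline. The function names, the gene identifiers, and the expression values are hypothetical, and the choice of expression unit (for example FPKM or TPM) is an assumption the record does not specify.

```python
import math


def expressional_odds(ri_expr: float, non_ri_expr: float) -> float:
    """Expression of RI-containing isoforms divided by expression of
    isoforms without the RI, for one gene (hypothetical helper)."""
    if ri_expr < 0 or non_ri_expr < 0:
        raise ValueError("expression values must be non-negative")
    if non_ri_expr == 0:
        # Only RI-containing isoforms are expressed (or nothing is).
        return math.inf if ri_expr > 0 else math.nan
    return ri_expr / non_ri_expr


def meets_50_50(ri_expr: float, non_ri_expr: float) -> bool:
    """True when the odds reach or exceed 50/50, i.e. a ratio of at least 1.

    Comparing the two values directly avoids dividing by zero.
    """
    return ri_expr > 0 and ri_expr >= non_ri_expr


# Hypothetical per-gene values: (RI-containing isoforms, isoforms without the RI).
isoform_expression = {
    "geneA": (12.0, 8.0),  # odds 1.5, above 50/50
    "geneB": (3.0, 9.0),   # odds 1/3, below 50/50
    "geneC": (5.0, 5.0),   # odds exactly 50/50
}

selected = [
    gene
    for gene, (ri, non_ri) in isoform_expression.items()
    if meets_50_50(ri, non_ri)
]
print(selected)
```

In this sketch, genes whose retained-intron isoforms are expressed at least as strongly as their spliced counterparts pass the filter; the remaining features listed in the description (splice-site strength, motif occurrence, and so on) would be computed separately before any classifier is trained.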