Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data
BACKGROUND: The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. RES...
Main Authors: | Zhang, Wensheng; Edwards, Andrea; Fan, Wei; Fang, Zhide; Deininger, Prescott; Zhang, Kun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765721/ https://www.ncbi.nlm.nih.gov/pubmed/23984937 http://dx.doi.org/10.1186/1471-2164-14-584 |
_version_ | 1782283375710568448 |
---|---|
author | Zhang, Wensheng; Edwards, Andrea; Fan, Wei; Fang, Zhide; Deininger, Prescott; Zhang, Kun |
author_facet | Zhang, Wensheng; Edwards, Andrea; Fan, Wei; Fang, Zhide; Deininger, Prescott; Zhang, Kun |
author_sort | Zhang, Wensheng |
collection | PubMed |
description | BACKGROUND: The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. RESULTS: This study seeks to validate and extend previous studies on the expression of TE exons by an integrative statistical analysis of high-throughput RNA sequencing data. We collected 26 RNA-seq datasets spanning multiple tissues and cancer types. The exon-level digital expressions (indicating retention rates in mRNAs) were quantified by a double-normalized measure, called the rescaled RPKM (Reads Per Kilobase of exon model per Million mapped reads). We analyzed the distribution profiles and the variability (across samples and between tissue/disease groups) of TE exon expressions, and compared them with those of other constitutive or cassette exons. We inferred the effects of four genomic factors, namely the location, length, cognate TE family and TE nucleotide proportion (RTE, see Methods section) of a TE exon, on the exon's expression level and expression variability. We also investigated the biological implications of an assembly of highly expressed TE exons. CONCLUSION: Our analysis confirmed prior studies in the following four respects. First, with relatively high expression variability, most TE exons in mRNAs, especially those without exact counterparts in the UCSC RefSeq (Reference Sequence) gene tables, demonstrate low but still detectable expression levels in most tissue samples. Second, the TE exons in coding DNA sequences (CDSs) are less highly expressed than those in 3′ (5′) untranslated regions (UTRs). Third, the exons derived from chronologically ancient repeat elements, such as MIRs, tend to be highly expressed in comparison with those derived from younger TEs. Fourth, the previously observed negative relationship between the lengths of exons and the inclusion levels in transcripts also holds for exonized TEs. Furthermore, our study resulted in several novel findings: (1) for the TE exons with non-zero expression, and as shown in most of the studied biological samples, a high TE nucleotide proportion leads to lower retention rates in mRNAs; (2) the considered genomic features (i.e. a continuous variable such as the exon length or a category indicator such as 3′UTR) influence the expression level and the expression variability (CV) of TE exons in an inverse manner; (3) not only the exons derived from Alu elements but also the exons from the TEs of other families were preferentially established in zinc finger (ZNF) genes. |
format | Online Article Text |
id | pubmed-3765721 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3765721 2013-09-11 Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data Zhang, Wensheng Edwards, Andrea Fan, Wei Fang, Zhide Deininger, Prescott Zhang, Kun BMC Genomics Research Article BioMed Central 2013-08-28 /pmc/articles/PMC3765721/ /pubmed/23984937 http://dx.doi.org/10.1186/1471-2164-14-584 Text en Copyright © 2013 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Zhang, Wensheng Edwards, Andrea Fan, Wei Fang, Zhide Deininger, Prescott Zhang, Kun Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data |
title | Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data |
title_sort | inferring the expression variability of human transposable element-derived exons by linear model analysis of deep rna sequencing data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765721/ https://www.ncbi.nlm.nih.gov/pubmed/23984937 http://dx.doi.org/10.1186/1471-2164-14-584 |
work_keys_str_mv | AT zhangwensheng inferringtheexpressionvariabilityofhumantransposableelementderivedexonsbylinearmodelanalysisofdeeprnasequencingdata AT edwardsandrea inferringtheexpressionvariabilityofhumantransposableelementderivedexonsbylinearmodelanalysisofdeeprnasequencingdata AT fanwei inferringtheexpressionvariabilityofhumantransposableelementderivedexonsbylinearmodelanalysisofdeeprnasequencingdata AT fangzhide inferringtheexpressionvariabilityofhumantransposableelementderivedexonsbylinearmodelanalysisofdeeprnasequencingdata AT deiningerprescott inferringtheexpressionvariabilityofhumantransposableelementderivedexonsbylinearmodelanalysisofdeeprnasequencingdata AT zhangkun inferringtheexpressionvariabilityofhumantransposableelementderivedexonsbylinearmodelanalysisofdeeprnasequencingdata |
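
The description above names two quantities, RPKM (Reads Per Kilobase of exon model per Million mapped reads) and the expression variability (CV). As a minimal sketch, the following Python computes the standard RPKM and a coefficient of variation. It assumes the conventional RPKM definition and a sample standard deviation for the CV; the paper's additional "rescaled" normalization step is not described in this record and is not reproduced here. The function names and the example numbers are hypothetical.

```python
from statistics import mean, stdev


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of exon per million mapped reads.

    Assumption: this is the conventional definition only; the record's
    "rescaled RPKM" applies a further normalization not shown here.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean, across samples for one exon.

    Assumption: sample (n-1) standard deviation; returns NaN when the
    mean is zero, since the ratio is undefined there.
    """
    m = mean(values)
    if m == 0:
        return float("nan")
    return stdev(values) / m


# Hypothetical exon: 500 reads, 200 bp long, 25 million mapped reads.
example_rpkm = rpkm(500, 200, 25_000_000)

# Hypothetical expression values of one exon in three samples.
example_cv = coefficient_of_variation([2.0, 4.0, 6.0])
```

Dividing by exon length and by sequencing depth is what makes values comparable between exons of different lengths and between samples, which is the comparison the CV then summarizes for each exon.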