
Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data

BACKGROUND: The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. RES...


Bibliographic Details
Main Authors: Zhang, Wensheng, Edwards, Andrea, Fan, Wei, Fang, Zhide, Deininger, Prescott, Zhang, Kun
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765721/
https://www.ncbi.nlm.nih.gov/pubmed/23984937
http://dx.doi.org/10.1186/1471-2164-14-584
author Zhang, Wensheng
Edwards, Andrea
Fan, Wei
Fang, Zhide
Deininger, Prescott
Zhang, Kun
author_sort Zhang, Wensheng
collection PubMed
description BACKGROUND: The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. RESULTS: This study seeks to validate and extend previous studies on the expression of TE exons by an integrative statistical analysis of high-throughput RNA sequencing data. We collected 26 RNA-seq datasets spanning multiple tissues and cancer types. The exon-level digital expressions (indicating retention rates in mRNAs) were quantified by a doubly normalized measure, called the rescaled RPKM (Reads Per Kilobase of exon model per Million mapped reads). We analyzed the distribution profiles and the variability (across samples and between tissue/disease groups) of TE exon expressions, and compared them with those of other constitutive or cassette exons. We inferred the effects of four genomic factors, namely the location, length, cognate TE family and TE nucleotide proportion (RTE, see Methods section) of a TE exon, on the exon's expression level and expression variability. We also investigated the biological implications of an assembly of highly expressed TE exons. CONCLUSION: Our analysis confirmed prior studies in the following four respects. First, with relatively high expression variability, most TE exons in mRNAs, especially those without exact counterparts in the UCSC RefSeq (Reference Sequence) gene tables, demonstrate low but still detectable expression levels in most tissue samples. Second, the TE exons in coding DNA sequences (CDSs) are less highly expressed than those in 3′ (5′) untranslated regions (UTRs). Third, the exons derived from chronologically ancient repeat elements, such as MIRs, tend to be more highly expressed than those derived from younger TEs. Fourth, the previously observed negative relationship between the lengths of exons and their inclusion levels in transcripts also holds for exonized TEs. Furthermore, our study yielded several novel findings: (1) for the TE exons with non-zero expression, and as shown in most of the studied biological samples, a high TE nucleotide proportion leads to lower retention rates in mRNAs; (2) the considered genomic features (i.e. a continuous variable such as the exon length or a category indicator such as 3′UTR) influence the expression level and the expression variability (CV) of TE exons in an inverse manner; (3) not only the exons derived from Alu elements but also the exons from the TEs of other families were preferentially established in zinc finger (ZNF) genes.
format Online
Article
Text
id pubmed-3765721
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3765721 2013-09-11 Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data Zhang, Wensheng Edwards, Andrea Fan, Wei Fang, Zhide Deininger, Prescott Zhang, Kun BMC Genomics Research Article BioMed Central 2013-08-28 /pmc/articles/PMC3765721/ /pubmed/23984937 http://dx.doi.org/10.1186/1471-2164-14-584 Text en Copyright © 2013 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765721/
https://www.ncbi.nlm.nih.gov/pubmed/23984937
http://dx.doi.org/10.1186/1471-2164-14-584