
A Novel Analytical Strategy to Identify Fusion Transcripts between Repetitive Elements and Protein Coding-Exons Using RNA-Seq

Repetitive elements (REs) comprise 40–60% of the mammalian genome and have been shown to epigenetically influence the expression of genes through the formation of fusion transcripts (FTs). We previously showed that an intracisternal A particle forms an FT with the agouti gene in mice, causing obesity...

Bibliographic Details
Main authors: Wang, Tianyuan; Santos, Janine H.; Feng, Jian; Fargo, David C.; Shen, Li; Riadi, Gonzalo; Keeley, Elizabeth; Rosh, Zachary S.; Nestler, Eric J.; Woychik, Richard P.
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4945064/
https://www.ncbi.nlm.nih.gov/pubmed/27415830
http://dx.doi.org/10.1371/journal.pone.0159028
author Wang, Tianyuan
Santos, Janine H.
Feng, Jian
Fargo, David C.
Shen, Li
Riadi, Gonzalo
Keeley, Elizabeth
Rosh, Zachary S.
Nestler, Eric J.
Woychik, Richard P.
author_sort Wang, Tianyuan
collection PubMed
description Repetitive elements (REs) comprise 40–60% of the mammalian genome and have been shown to epigenetically influence the expression of genes through the formation of fusion transcripts (FTs). We previously showed that an intracisternal A particle forms an FT with the agouti gene in mice, causing obesity/type 2 diabetes. To determine the frequency of FTs genome-wide, we developed a TopHat-Fusion-based analytical pipeline to identify FTs with high specificity. We applied it to an RNA-seq dataset from the nucleus accumbens (NAc) of mice repeatedly exposed to cocaine. Cocaine was previously shown to increase the expression of certain REs in this brain region. Using this pipeline, which can be applied to single- or paired-end reads, we identified 438 genes expressing 813 different FTs in the NAc. Although all types of studied repeats were present in FTs, simple sequence repeats were underrepresented. Most importantly, reverse-transcription and quantitative PCR validated the expression of selected FTs in an independent cohort of animals, which also revealed that some FTs are the prominent isoforms expressed in the NAc by some genes. In other RNA-seq datasets, developmental expression as well as tissue specificity of some FTs differed from their corresponding non-fusion counterparts. Finally, in silico analysis predicted changes in the structure of proteins encoded by some FTs, potentially resulting in gain or loss of function. Collectively, these results indicate the robustness of our pipeline in detecting these new isoforms of genes, which we believe provides a valuable tool to aid in better understanding the broad role of REs in mammalian cellular biology.
format Online
Article
Text
id pubmed-4945064
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4945064 2016-08-08 A Novel Analytical Strategy to Identify Fusion Transcripts between Repetitive Elements and Protein Coding-Exons Using RNA-Seq PLoS One Research Article Public Library of Science 2016-07-14 /pmc/articles/PMC4945064/ /pubmed/27415830 http://dx.doi.org/10.1371/journal.pone.0159028 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title A Novel Analytical Strategy to Identify Fusion Transcripts between Repetitive Elements and Protein Coding-Exons Using RNA-Seq
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4945064/
https://www.ncbi.nlm.nih.gov/pubmed/27415830
http://dx.doi.org/10.1371/journal.pone.0159028