Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome

BACKGROUND: Recent studies of the mammalian transcriptome have revealed a large number of additional transcribed regions and extraordinary complexity in transcript diversity. However, there is still much uncertainty regarding precisely what portion of the genome is transcribed, the exact structures...


Bibliographic Details
Main Authors: Wu, Jia Qian; Du, Jiang; Rozowsky, Joel; Zhang, Zhengdong; Urban, Alexander E; Euskirchen, Ghia; Weissman, Sherman; Gerstein, Mark; Snyder, Michael
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2395237/
https://www.ncbi.nlm.nih.gov/pubmed/18173853
http://dx.doi.org/10.1186/gb-2008-9-1-r3
author Wu, Jia Qian
Du, Jiang
Rozowsky, Joel
Zhang, Zhengdong
Urban, Alexander E
Euskirchen, Ghia
Weissman, Sherman
Gerstein, Mark
Snyder, Michael
collection PubMed
description BACKGROUND: Recent studies of the mammalian transcriptome have revealed a large number of additional transcribed regions and extraordinary complexity in transcript diversity. However, there is still much uncertainty regarding precisely what portion of the genome is transcribed, the exact structures of these novel transcripts, and the levels of the transcripts produced. RESULTS: We have interrogated the transcribed loci in 420 selected ENCyclopedia Of DNA Elements (ENCODE) regions using rapid amplification of cDNA ends (RACE) sequencing. We analyzed annotated known gene regions, but primarily we focused on novel transcriptionally active regions (TARs), which were previously identified by high-density oligonucleotide tiling arrays and on random regions that were not believed to be transcribed. We found RACE sequencing to be very sensitive and were able to detect low levels of transcripts in specific cell types that were not detectable by microarrays. We also observed many instances of sense-antisense transcripts; further analysis suggests that many of the antisense transcripts (but not all) may be artifacts generated from the reverse transcription reaction. Our results show that the majority of the novel TARs analyzed (60%) are connected to other novel TARs or known exons. Of previously unannotated random regions, 17% were shown to produce overlapping transcripts. Furthermore, it is estimated that 9% of the novel transcripts encode proteins. CONCLUSION: We conclude that RACE sequencing is an efficient, sensitive, and highly accurate method for characterization of the transcriptome of specific cell/tissue types. Using this method, it appears that much of the genome is represented in polyA+ RNA. Moreover, a fraction of the novel RNAs can encode protein and are likely to be functional.
format Text
id pubmed-2395237
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2395237 2008-05-24 Genome Biol Research BioMed Central 2008-01-03 /pmc/articles/PMC2395237/ /pubmed/18173853 http://dx.doi.org/10.1186/gb-2008-9-1-r3 Text en Copyright © 2008 Wu et al.; licensee BioMed Central Ltd. https://creativecommons.org/licenses/by/2.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome
topic Research