
Detecting transcription of ribosomal protein pseudogenes in diverse human tissues from RNA-seq data

BACKGROUND: Ribosomal proteins (RPs) have about 2000 pseudogenes in the human genome. Although anecdotal reports of RP pseudogene transcription exist, it is unclear to what extent these pseudogenes are transcribed. RP pseudogene transcription is difficult to identify in microarrays due to potenti...


Bibliographic Details
Main Authors: Tonner, Peter; Srinivasasainagendra, Vinodh; Zhang, Shaojie; Zhi, Degui
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3478165/
https://www.ncbi.nlm.nih.gov/pubmed/22908858
http://dx.doi.org/10.1186/1471-2164-13-412
_version_ 1782247274382884864
author Tonner, Peter
Srinivasasainagendra, Vinodh
Zhang, Shaojie
Zhi, Degui
collection PubMed
description BACKGROUND: Ribosomal proteins (RPs) have about 2000 pseudogenes in the human genome. Although anecdotal reports of RP pseudogene transcription exist, it is unclear to what extent these pseudogenes are transcribed. RP pseudogene transcription is difficult to identify with microarrays because of potential cross-hybridization between transcripts from the parent genes and the pseudogenes. Transcriptome sequencing (RNA-seq) now provides an opportunity to ascertain the transcription of pseudogenes. A challenge for pseudogene expression discovery in RNA-seq data is uniquely identifying reads that map to pseudogene regions, which are typically similar in sequence to the parent genes.
RESULTS: We developed a specialized pipeline for pseudogene transcription discovery. We first construct a “composite genome” that includes the entire human genome sequence as well as the mRNA sequences of real ribosomal protein genes. We then map all sequence reads to the composite genome and retain only exact matches. Moreover, we restrict our analysis to strictly defined mappable regions and calculate RPKM values as a measure of pseudogene transcription levels. We report evidence for the transcription of RP pseudogenes in 16 human tissues. By analyzing the Human Body Map 2.0 RNA-sequencing data with our pipeline, we found that one RP pseudogene (PGOHUM-249508) is transcribed at an RPKM of 170 in thyroid. Three other RP pseudogenes are transcribed with RPKM > 10, a level similar to that of the normal RP genes, in white blood cells, kidney, and testes, respectively. A further thirteen RP pseudogenes have RPKM > 5, corresponding to the 20th–30th percentile among all genes. Unlike ribosomal protein genes, which are constitutively expressed in almost all tissues, RP pseudogenes are differentially expressed, suggesting that they may contribute to tissue-specific biological processes.
CONCLUSIONS: Using a specialized bioinformatics method, we detected transcription of ribosomal protein pseudogenes in human tissues from RNA-seq data.
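The read-filtering and RPKM steps described in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' pipeline: the `Alignment` record, its field names, and both function names are hypothetical, and requiring a single mapping location (`n_hits == 1`) is an assumption about how uniquely identified reads might be selected. RPKM uses the standard definition (reads per kilobase of region per million mapped reads), with the mappable length of the region assumed as the length term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One read alignment against the composite genome (hypothetical record)."""
    region: str      # label of the region the read falls in, e.g. a pseudogene ID
    mismatches: int  # mismatches reported by the aligner
    n_hits: int      # number of locations the read maps to


def count_unique_exact(alignments):
    """Count reads per region, keeping only exact matches.

    Assumption: a read is also required to map to exactly one location,
    so reads shared with a parent gene mRNA are discarded.
    """
    counts = {}
    for aln in alignments:
        if aln.mismatches == 0 and aln.n_hits == 1:
            counts[aln.region] = counts.get(aln.region, 0) + 1
    return counts


def rpkm(read_count, mappable_length_bp, total_mapped_reads):
    """Reads per kilobase of (mappable) region per million mapped reads."""
    if mappable_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    return read_count * 1e9 / (mappable_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Toy input, invented purely for illustration.
    reads = [
        Alignment("pseudogene_A", 0, 1),
        Alignment("pseudogene_A", 1, 1),  # dropped: not an exact match
        Alignment("pseudogene_A", 0, 2),  # dropped: maps to two locations
        Alignment("pseudogene_B", 0, 1),
    ]
    counts = count_unique_exact(reads)
    for region, n in sorted(counts.items()):
        print(region, rpkm(n, mappable_length_bp=500, total_mapped_reads=4))
```

Restricting the length term to the mappable portion of a region keeps the density estimate from being diluted by positions where no read could ever be assigned uniquely.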
format Online
Article
Text
id pubmed-3478165
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3478165 2012-10-23 Detecting transcription of ribosomal protein pseudogenes in diverse human tissues from RNA-seq data Tonner, Peter Srinivasasainagendra, Vinodh Zhang, Shaojie Zhi, Degui BMC Genomics Research Article BioMed Central 2012-08-21 /pmc/articles/PMC3478165/ /pubmed/22908858 http://dx.doi.org/10.1186/1471-2164-13-412 Text en Copyright ©2012 Tonner et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Detecting transcription of ribosomal protein pseudogenes in diverse human tissues from RNA-seq data
topic Research Article