
Global Intersection of Long Non-Coding RNAs with Processed and Unprocessed Pseudogenes in the Human Genome

Pseudogenes are abundant in the human genome and had long been thought of purely as nonfunctional gene fossils. Recent observations point to a role for pseudogenes in regulating genes transcriptionally and post-transcriptionally in human cells. To computationally interrogate the network space of int...


Bibliographic Details
Main Authors: Milligan, Michael J., Harvey, Erin, Yu, Albert, Morgan, Ashleigh L., Smith, Daniela L., Zhang, Eden, Berengut, Jonathan, Sivananthan, Jothini, Subramaniam, Radhini, Skoric, Aleksandra, Collins, Scott, Damski, Caio, Morris, Kevin V., Lipovich, Leonard
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2016
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4805607/
https://www.ncbi.nlm.nih.gov/pubmed/27047535
http://dx.doi.org/10.3389/fgene.2016.00026
author Milligan, Michael J.
Harvey, Erin
Yu, Albert
Morgan, Ashleigh L.
Smith, Daniela L.
Zhang, Eden
Berengut, Jonathan
Sivananthan, Jothini
Subramaniam, Radhini
Skoric, Aleksandra
Collins, Scott
Damski, Caio
Morris, Kevin V.
Lipovich, Leonard
author_sort Milligan, Michael J.
collection PubMed
description Pseudogenes are abundant in the human genome and had long been thought of purely as nonfunctional gene fossils. Recent observations point to a role for pseudogenes in regulating genes transcriptionally and post-transcriptionally in human cells. To computationally interrogate the network space of integrated pseudogene and long non-coding RNA regulation in the human transcriptome, we developed and implemented an algorithm to identify all long non-coding RNA (lncRNA) transcripts that overlap the genomic spans, and specifically the exons, of any human pseudogenes in either sense or antisense orientation. As inputs to our algorithm, we imported three public repositories of pseudogenes: GENCODE v17 (processed and unprocessed, Ensembl 72); Retroposed Pseudogenes V5 (processed only), and Yale Pseudo60 (processed and unprocessed, Ensembl 60); two public lncRNA catalogs: Broad Institute, GENCODE v17; NCBI annotated piRNAs; and NHGRI clinical variants. The data sets were retrieved from the UCSC Genome Database using the UCSC Table Browser. We identified 2277 loci containing exon-to-exon overlaps between pseudogenes, both processed and unprocessed, and long non-coding RNA genes. Of these loci we identified 1167 with Genbank EST and full-length cDNA support providing direct evidence of transcription on one or both strands with exon-to-exon overlaps. The analysis converged on 313 pseudogene-lncRNA exon-to-exon overlaps that were bidirectionally supported by both full-length cDNAs and ESTs. In the process of identifying transcribed pseudogenes, we generated a comprehensive, positionally non-redundant encyclopedia of human pseudogenes, drawing upon multiple, and formerly disparate public pseudogene repositories. Collectively, these observations suggest that pseudogenes are pervasively transcribed on both strands and are common drivers of gene regulation.
format Online
Article
Text
id pubmed-4805607
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-48056072016-04-04 Global Intersection of Long Non-Coding RNAs with Processed and Unprocessed Pseudogenes in the Human Genome Milligan, Michael J. Harvey, Erin Yu, Albert Morgan, Ashleigh L. Smith, Daniela L. Zhang, Eden Berengut, Jonathan Sivananthan, Jothini Subramaniam, Radhini Skoric, Aleksandra Collins, Scott Damski, Caio Morris, Kevin V. Lipovich, Leonard Front Genet Genetics Pseudogenes are abundant in the human genome and had long been thought of purely as nonfunctional gene fossils. Recent observations point to a role for pseudogenes in regulating genes transcriptionally and post-transcriptionally in human cells. To computationally interrogate the network space of integrated pseudogene and long non-coding RNA regulation in the human transcriptome, we developed and implemented an algorithm to identify all long non-coding RNA (lncRNA) transcripts that overlap the genomic spans, and specifically the exons, of any human pseudogenes in either sense or antisense orientation. As inputs to our algorithm, we imported three public repositories of pseudogenes: GENCODE v17 (processed and unprocessed, Ensembl 72); Retroposed Pseudogenes V5 (processed only), and Yale Pseudo60 (processed and unprocessed, Ensembl 60); two public lncRNA catalogs: Broad Institute, GENCODE v17; NCBI annotated piRNAs; and NHGRI clinical variants. The data sets were retrieved from the UCSC Genome Database using the UCSC Table Browser. We identified 2277 loci containing exon-to-exon overlaps between pseudogenes, both processed and unprocessed, and long non-coding RNA genes. Of these loci we identified 1167 with Genbank EST and full-length cDNA support providing direct evidence of transcription on one or both strands with exon-to-exon overlaps. The analysis converged on 313 pseudogene-lncRNA exon-to-exon overlaps that were bidirectionally supported by both full-length cDNAs and ESTs. 
In the process of identifying transcribed pseudogenes, we generated a comprehensive, positionally non-redundant encyclopedia of human pseudogenes, drawing upon multiple, and formerly disparate public pseudogene repositories. Collectively, these observations suggest that pseudogenes are pervasively transcribed on both strands and are common drivers of gene regulation. Frontiers Media S.A. 2016-03-24 /pmc/articles/PMC4805607/ /pubmed/27047535 http://dx.doi.org/10.3389/fgene.2016.00026 Text en Copyright © 2016 Milligan, Harvey, Yu, Morgan, Smith, Zhang, Berengut, Sivananthan, Subramaniam, Skoric, Collins, Damski, Morris and Lipovich. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Milligan, Michael J.
Harvey, Erin
Yu, Albert
Morgan, Ashleigh L.
Smith, Daniela L.
Zhang, Eden
Berengut, Jonathan
Sivananthan, Jothini
Subramaniam, Radhini
Skoric, Aleksandra
Collins, Scott
Damski, Caio
Morris, Kevin V.
Lipovich, Leonard
Global Intersection of Long Non-Coding RNAs with Processed and Unprocessed Pseudogenes in the Human Genome
title Global Intersection of Long Non-Coding RNAs with Processed and Unprocessed Pseudogenes in the Human Genome
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4805607/
https://www.ncbi.nlm.nih.gov/pubmed/27047535
http://dx.doi.org/10.3389/fgene.2016.00026