RNA editing in the human ENCODE RNA-seq data
RNA-seq data can be mined for sequence differences relative to the reference genome to identify both genomic SNPs and RNA editing events. We analyzed the long, polyA-selected, unstranded, deeply sequenced RNA-seq data from the ENCODE Project across 14 human cell lines for candidate RNA editing event...
Main Authors: | Park, Eddie; Williams, Brian; Wold, Barbara J.; Mortazavi, Ali |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press, 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431480/ https://www.ncbi.nlm.nih.gov/pubmed/22955975 http://dx.doi.org/10.1101/gr.134957.111 |
_version_ | 1782242089546809344 |
---|---|
author | Park, Eddie; Williams, Brian; Wold, Barbara J.; Mortazavi, Ali |
author_facet | Park, Eddie; Williams, Brian; Wold, Barbara J.; Mortazavi, Ali |
author_sort | Park, Eddie |
collection | PubMed |
description | RNA-seq data can be mined for sequence differences relative to the reference genome to identify both genomic SNPs and RNA editing events. We analyzed the long, polyA-selected, unstranded, deeply sequenced RNA-seq data from the ENCODE Project across 14 human cell lines for candidate RNA editing events. On average, 43% of the RNA sequencing variants that are not in dbSNP and are within gene boundaries are A-to-G(I) RNA editing candidates. The vast majority of A-to-G(I) edits are located in introns and 3′ UTRs, with only 123 located in protein-coding sequence. In contrast, the majority of non–A-to-G variants (60%–80%) map near exon boundaries and have the characteristics of splice-mapping artifacts. After filtering out all candidates with evidence of private genomic variation using genome resequencing or ChIP-seq data, we find that up to 85% of the high-confidence RNA variants are A-to-G(I) editing candidates. Genes with A-to-G(I) edits are enriched in Gene Ontology terms involving cell division, viral defense, and translation. The distribution and character of the remaining non–A-to-G variants closely resemble known SNPs. We find no reproducible A-to-G(I) edits that result in nonsynonymous substitutions in all three lymphoblastoid cell lines in our study, unlike RNA editing in the brain. Given that only a fraction of sites are reproducibly edited in multiple cell lines and that we find a stronger association of editing and specific genes suggests that the editing of the transcript is more important than the editing of any individual site. |
format | Online Article Text |
id | pubmed-3431480 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-34314802012-09-08 RNA editing in the human ENCODE RNA-seq data Park, Eddie Williams, Brian Wold, Barbara J. Mortazavi, Ali Genome Res Research RNA-seq data can be mined for sequence differences relative to the reference genome to identify both genomic SNPs and RNA editing events. We analyzed the long, polyA-selected, unstranded, deeply sequenced RNA-seq data from the ENCODE Project across 14 human cell lines for candidate RNA editing events. On average, 43% of the RNA sequencing variants that are not in dbSNP and are within gene boundaries are A-to-G(I) RNA editing candidates. The vast majority of A-to-G(I) edits are located in introns and 3′ UTRs, with only 123 located in protein-coding sequence. In contrast, the majority of non–A-to-G variants (60%–80%) map near exon boundaries and have the characteristics of splice-mapping artifacts. After filtering out all candidates with evidence of private genomic variation using genome resequencing or ChIP-seq data, we find that up to 85% of the high-confidence RNA variants are A-to-G(I) editing candidates. Genes with A-to-G(I) edits are enriched in Gene Ontology terms involving cell division, viral defense, and translation. The distribution and character of the remaining non–A-to-G variants closely resemble known SNPs. We find no reproducible A-to-G(I) edits that result in nonsynonymous substitutions in all three lymphoblastoid cell lines in our study, unlike RNA editing in the brain. Given that only a fraction of sites are reproducibly edited in multiple cell lines and that we find a stronger association of editing and specific genes suggests that the editing of the transcript is more important than the editing of any individual site. 
Cold Spring Harbor Laboratory Press 2012-09 /pmc/articles/PMC3431480/ /pubmed/22955975 http://dx.doi.org/10.1101/gr.134957.111 Text en © 2012, Published by Cold Spring Harbor Laboratory Press This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/. |
spellingShingle | Research Park, Eddie Williams, Brian Wold, Barbara J. Mortazavi, Ali RNA editing in the human ENCODE RNA-seq data |
title | RNA editing in the human ENCODE RNA-seq data |
title_full | RNA editing in the human ENCODE RNA-seq data |
title_fullStr | RNA editing in the human ENCODE RNA-seq data |
title_full_unstemmed | RNA editing in the human ENCODE RNA-seq data |
title_short | RNA editing in the human ENCODE RNA-seq data |
title_sort | rna editing in the human encode rna-seq data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431480/ https://www.ncbi.nlm.nih.gov/pubmed/22955975 http://dx.doi.org/10.1101/gr.134957.111 |
work_keys_str_mv | AT parkeddie rnaeditinginthehumanencodernaseqdata AT williamsbrian rnaeditinginthehumanencodernaseqdata AT woldbarbaraj rnaeditinginthehumanencodernaseqdata AT mortazaviali rnaeditinginthehumanencodernaseqdata |