Analyzing allele specific RNA expression using mixture models
BACKGROUND: Measuring allele-specific RNA expression provides valuable insights into cis-acting genetic and epigenetic regulation of gene expression. Widespread adoption of high-throughput sequencing technologies for studying RNA expression (RNA-Seq) permits measurement of allelic RNA expression imb...
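Main Authors: | Lu, Rong; Smith, Ryan M; Seweryn, Michal; Wang, Danxin; Hartmann, Katherine; Webb, Amy; Sadee, Wolfgang; Rempala, Grzegorz A |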
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4521363/ https://www.ncbi.nlm.nih.gov/pubmed/26231172 http://dx.doi.org/10.1186/s12864-015-1749-0 |
_version_ | 1782383804303802368 |
---|---|
author | Lu, Rong; Smith, Ryan M; Seweryn, Michal; Wang, Danxin; Hartmann, Katherine; Webb, Amy; Sadee, Wolfgang; Rempala, Grzegorz A |
author_facet | Lu, Rong; Smith, Ryan M; Seweryn, Michal; Wang, Danxin; Hartmann, Katherine; Webb, Amy; Sadee, Wolfgang; Rempala, Grzegorz A |
author_sort | Lu, Rong |
collection | PubMed |
description | BACKGROUND: Measuring allele-specific RNA expression provides valuable insights into cis-acting genetic and epigenetic regulation of gene expression. Widespread adoption of high-throughput sequencing technologies for studying RNA expression (RNA-Seq) permits measurement of allelic RNA expression imbalance (AEI) at heterozygous single nucleotide polymorphisms (SNPs) across the entire transcriptome, and this approach has become especially popular with the emergence of large databases, such as GTEx. However, the existing binomial-type methods used to model allelic expression from RNA-seq assume a strong negative correlation between reference and variant allele reads, which may not be reasonable biologically. RESULTS: Here we propose a new strategy for AEI analysis using RNA-seq data. Under the null hypothesis of no AEI, a group of SNPs (possibly across multiple genes) is considered comparable if their respective total sums of the allelic reads are of similar magnitude. Within each group of “comparable” SNPs, we identify SNPs with AEI signal by fitting a mixture of folded Skellam distributions to the absolute values of read differences. By applying this methodology to RNA-Seq data from human autopsy brain tissues, we identified numerous instances of moderate to strong imbalanced allelic RNA expression at heterozygous SNPs. Findings with SLC1A3 mRNA exhibiting known expression differences are discussed as examples. CONCLUSION: The folded Skellam mixture model searches for SNPs with significant difference between reference and variant allele reads (adjusted for different library sizes), using information from a group of “comparable” SNPs across multiple genes. This model is particularly suitable for performing AEI analysis on genes with few heterozygous SNPs available from RNA-seq, and it can fit over-dispersed read counts without specifying the direction of the correlation between reference and variant alleles. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1749-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4521363 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4521363 2015-08-01 Analyzing allele specific RNA expression using mixture models Lu, Rong; Smith, Ryan M; Seweryn, Michal; Wang, Danxin; Hartmann, Katherine; Webb, Amy; Sadee, Wolfgang; Rempala, Grzegorz A BMC Genomics Methodology Article BioMed Central 2015-08-01 /pmc/articles/PMC4521363/ /pubmed/26231172 http://dx.doi.org/10.1186/s12864-015-1749-0 Text en © Lu et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Analyzing allele specific RNA expression using mixture models |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4521363/ https://www.ncbi.nlm.nih.gov/pubmed/26231172 http://dx.doi.org/10.1186/s12864-015-1749-0 |
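
The abstract above describes fitting a mixture of folded Skellam distributions to the absolute differences between reference and variant allele read counts. The following is a minimal sketch of that idea, not the authors' implementation: it only evaluates the folded Skellam probability mass function and the posterior component probabilities for a mixture whose parameters are already given, and it does not fit the mixture or adjust for library size. All function names, the Poisson rates, and the mixture weights are hypothetical and chosen purely for illustration.

```python
import math


def poisson_pmf(n, mu):
    """Poisson probability P(N = n) for rate mu > 0, computed in log space."""
    if n < 0:
        return 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def skellam_pmf(k, mu1, mu2):
    """P(X1 - X2 = k) for independent X1 ~ Poisson(mu1), X2 ~ Poisson(mu2).

    Evaluated as a truncated convolution sum over the value j of X2.
    The cutoff is an illustrative choice that covers the bulk of Poisson(mu2).
    """
    upper = int(mu2 + 12 * math.sqrt(mu2) + 20)
    return sum(
        poisson_pmf(j + k, mu1) * poisson_pmf(j, mu2)
        for j in range(max(0, -k), upper)
    )


def folded_skellam_pmf(d, mu1, mu2):
    """P(|X1 - X2| = d): the Skellam distribution folded at zero."""
    if d < 0:
        return 0.0
    if d == 0:
        return skellam_pmf(0, mu1, mu2)
    return skellam_pmf(d, mu1, mu2) + skellam_pmf(-d, mu1, mu2)


def component_posteriors(d, components, weights):
    """Posterior probability of each mixture component given |ref - alt| = d.

    components: list of (mu1, mu2) Poisson rate pairs, one per component.
    weights:    mixing proportions, assumed to sum to 1.
    """
    joint = [
        w * folded_skellam_pmf(d, mu1, mu2)
        for w, (mu1, mu2) in zip(weights, components)
    ]
    total = sum(joint)
    return [p / total for p in joint]


# Hypothetical group of "comparable" SNPs with about 60 total reads each:
# component 0 models balanced expression, component 1 models imbalance.
COMPONENTS = [(30.0, 30.0), (45.0, 15.0)]
WEIGHTS = [0.9, 0.1]

small_difference = component_posteriors(2, COMPONENTS, WEIGHTS)
large_difference = component_posteriors(35, COMPONENTS, WEIGHTS)
```

Because the distribution is folded, swapping the two rates of a component leaves the probabilities unchanged, which is why a model of this kind does not need to specify which allele is over-expressed. In this sketch a small read difference is attributed mostly to the balanced component and a large one mostly to the imbalanced component.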