Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model
BACKGROUND: High-throughput sequencing experiments, which can determine allele origins, have been used to assess genome-wide allele-specific expression. Despite the amount of data generated from high-throughput experiments, statistical methods are often too simplistic to understand the complexity of...
Main Authors: | Xie, Jing; Ji, Tieming; Ferreira, Marco A. R.; Li, Yahan; Patel, Bhaumik N.; Rivera, Rocio M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819473/ https://www.ncbi.nlm.nih.gov/pubmed/31660858 http://dx.doi.org/10.1186/s12859-019-3141-6 |
_version_ | 1783463738652753920 |
---|---|
author | Xie, Jing; Ji, Tieming; Ferreira, Marco A. R.; Li, Yahan; Patel, Bhaumik N.; Rivera, Rocio M. |
author_facet | Xie, Jing; Ji, Tieming; Ferreira, Marco A. R.; Li, Yahan; Patel, Bhaumik N.; Rivera, Rocio M. |
author_sort | Xie, Jing |
collection | PubMed |
description | BACKGROUND: High-throughput sequencing experiments, which can determine allele origins, have been used to assess genome-wide allele-specific expression. Despite the amount of data generated from high-throughput experiments, statistical methods are often too simplistic to understand the complexity of gene expression. Specifically, existing methods do not test allele-specific expression (ASE) of a gene as a whole and variation in ASE within a gene across exons separately and simultaneously. RESULTS: We propose a generalized linear mixed model to close these gaps, incorporating variations due to genes, single nucleotide polymorphisms (SNPs), and biological replicates. To improve reliability of statistical inferences, we assign priors on each effect in the model so that information is shared across genes in the entire genome. We utilize Bayesian model selection to test the hypothesis of ASE for each gene and variations across SNPs within a gene. We apply our method to four tissue types in a bovine study to detect ASE genes de novo in the bovine genome, and uncover intriguing predictions of regulatory ASEs across gene exons and across tissue types. We compared our method to competing approaches through simulation studies that mimicked the real datasets. The R package, BLMRM, that implements our proposed algorithm, is publicly available for download at https://github.com/JingXieMIZZOU/BLMRM. CONCLUSIONS: We show that the proposed method exhibits improved control of the false discovery rate and improved power over existing methods when SNP variation and biological variation are present. In addition, our method maintains low computational requirements that allow for whole genome analysis. |
format | Online Article Text |
id | pubmed-6819473 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6819473 2019-10-31 Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model Xie, Jing; Ji, Tieming; Ferreira, Marco A. R.; Li, Yahan; Patel, Bhaumik N.; Rivera, Rocio M. BMC Bioinformatics Methodology Article BACKGROUND: High-throughput sequencing experiments, which can determine allele origins, have been used to assess genome-wide allele-specific expression. Despite the amount of data generated from high-throughput experiments, statistical methods are often too simplistic to understand the complexity of gene expression. Specifically, existing methods do not test allele-specific expression (ASE) of a gene as a whole and variation in ASE within a gene across exons separately and simultaneously. RESULTS: We propose a generalized linear mixed model to close these gaps, incorporating variations due to genes, single nucleotide polymorphisms (SNPs), and biological replicates. To improve reliability of statistical inferences, we assign priors on each effect in the model so that information is shared across genes in the entire genome. We utilize Bayesian model selection to test the hypothesis of ASE for each gene and variations across SNPs within a gene. We apply our method to four tissue types in a bovine study to detect ASE genes de novo in the bovine genome, and uncover intriguing predictions of regulatory ASEs across gene exons and across tissue types. We compared our method to competing approaches through simulation studies that mimicked the real datasets. The R package, BLMRM, that implements our proposed algorithm, is publicly available for download at https://github.com/JingXieMIZZOU/BLMRM. CONCLUSIONS: We show that the proposed method exhibits improved control of the false discovery rate and improved power over existing methods when SNP variation and biological variation are present. In addition, our method maintains low computational requirements that allow for whole genome analysis. BioMed Central 2019-10-28 /pmc/articles/PMC6819473/ /pubmed/31660858 http://dx.doi.org/10.1186/s12859-019-3141-6 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Xie, Jing Ji, Tieming Ferreira, Marco A. R. Li, Yahan Patel, Bhaumik N. Rivera, Rocio M. Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model |
title | Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model |
title_full | Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model |
title_fullStr | Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model |
title_full_unstemmed | Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model |
title_short | Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model |
title_sort | modeling allele-specific expression at the gene and snp levels simultaneously by a bayesian logistic mixed regression model |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819473/ https://www.ncbi.nlm.nih.gov/pubmed/31660858 http://dx.doi.org/10.1186/s12859-019-3141-6 |
work_keys_str_mv | AT xiejing modelingallelespecificexpressionatthegeneandsnplevelssimultaneouslybyabayesianlogisticmixedregressionmodel AT jitieming modelingallelespecificexpressionatthegeneandsnplevelssimultaneouslybyabayesianlogisticmixedregressionmodel AT ferreiramarcoar modelingallelespecificexpressionatthegeneandsnplevelssimultaneouslybyabayesianlogisticmixedregressionmodel AT liyahan modelingallelespecificexpressionatthegeneandsnplevelssimultaneouslybyabayesianlogisticmixedregressionmodel AT patelbhaumikn modelingallelespecificexpressionatthegeneandsnplevelssimultaneouslybyabayesianlogisticmixedregressionmodel AT riverarociom modelingallelespecificexpressionatthegeneandsnplevelssimultaneouslybyabayesianlogisticmixedregressionmodel |
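
The description field outlines a logistic mixed model with gene, SNP, and biological-replicate effects. The record gives no equations, so the following is only a minimal sketch of what data under such a model might look like, assuming the log-odds of reads from one allele are a gene effect plus a random SNP effect plus a random replicate effect. The function names, the normal random effects, the fixed read depth, and the pooled estimator are all illustrative assumptions; none of this is the BLMRM package's API or the authors' exact specification.

```python
import math
import random


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_gene(beta, sigma_snp, sigma_rep, n_snps, n_reps, depth, rng):
    """Simulate read counts from one allele for a single gene.

    Assumed form (illustrative only, not taken from the paper):
        logit(p[j][k]) = beta + s[j] + r[k]
        y[j][k] ~ Binomial(depth, p[j][k])
    where s[j] is a SNP effect and r[k] is a replicate effect.
    """
    snp_effects = [rng.gauss(0.0, sigma_snp) for _ in range(n_snps)]
    rep_effects = [rng.gauss(0.0, sigma_rep) for _ in range(n_reps)]
    counts = []
    for s in snp_effects:
        row = []
        for r in rep_effects:
            p = inv_logit(beta + s + r)
            # Binomial draw as a sum of Bernoulli trials (stdlib only).
            row.append(sum(1 for _ in range(depth) if rng.random() < p))
        counts.append(row)
    return counts


def pooled_log_odds(counts, depth):
    """Naive pooled log-odds estimate with a 0.5 continuity correction.

    This ignores SNP and replicate variation; it is a crude baseline,
    not the Bayesian model selection described in the abstract.
    """
    total_y = sum(sum(row) for row in counts)
    total_n = depth * sum(len(row) for row in counts)
    return math.log((total_y + 0.5) / (total_n - total_y + 0.5))


rng = random.Random(0)  # fixed seed so the simulation is reproducible
counts = simulate_gene(beta=1.0, sigma_snp=0.3, sigma_rep=0.2,
                       n_snps=4, n_reps=3, depth=50, rng=rng)
estimate = pooled_log_odds(counts, depth=50)
```

In this sketch a gene effect of zero corresponds to balanced expression of the two alleles (an expected proportion of 0.5), so a gene-level test for ASE amounts to asking whether `beta` differs from zero, while a non-zero `sigma_snp` represents variation in ASE across SNPs within the gene.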