
Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) is a powerful profiling technique at single-cell resolution. Appropriate analysis of scRNA-seq data can characterize molecular heterogeneity and shed light on the underlying cellular process to better understand development and disease mechani...


Bibliographic Details
Main Authors: Dadaneh, Siamak Zamani; de Figueiredo, Paul; Sze, Sing-Hoi; Zhou, Mingyuan; Qian, Xiaoning
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7487589/
https://www.ncbi.nlm.nih.gov/pubmed/32900358
http://dx.doi.org/10.1186/s12864-020-06938-8
_version_ 1783581518049837056
author Dadaneh, Siamak Zamani
de Figueiredo, Paul
Sze, Sing-Hoi
Zhou, Mingyuan
Qian, Xiaoning
author_facet Dadaneh, Siamak Zamani
de Figueiredo, Paul
Sze, Sing-Hoi
Zhou, Mingyuan
Qian, Xiaoning
author_sort Dadaneh, Siamak Zamani
collection PubMed
description BACKGROUND: Single-cell RNA sequencing (scRNA-seq) is a powerful profiling technique at single-cell resolution. Appropriate analysis of scRNA-seq data can characterize molecular heterogeneity and shed light on the underlying cellular process to better understand development and disease mechanisms. The unique analytic challenge is to appropriately model highly over-dispersed scRNA-seq count data with prevalent dropouts (zero counts), which has made zero-inflated dimensionality reduction techniques popular for scRNA-seq data analyses. Employing zero-inflated distributions, however, may place extra emphasis on zero counts, leading to potential bias when identifying the latent structure of the data. RESULTS: In this paper, we propose a fully generative hierarchical gamma-negative binomial (hGNB) model of scRNA-seq data, obviating the need for explicitly modeling zero inflation. At the same time, hGNB can naturally account for covariate effects at both the gene and cell levels to identify complex latent representations of scRNA-seq data, without the need for commonly adopted pre-processing steps such as normalization. Efficient Bayesian model inference is derived by exploiting conditional conjugacy via novel data augmentation techniques. CONCLUSION: Experimental results on both simulated data and several real-world scRNA-seq datasets suggest that hGNB is a powerful tool for cell cluster discovery as well as cell lineage inference.
format Online
Article
Text
id pubmed-7487589
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-74875892020-09-15 Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data Dadaneh, Siamak Zamani de Figueiredo, Paul Sze, Sing-Hoi Zhou, Mingyuan Qian, Xiaoning BMC Genomics Research BACKGROUND: Single-cell RNA sequencing (scRNA-seq) is a powerful profiling technique at the single-cell resolution. Appropriate analysis of scRNA-seq data can characterize molecular heterogeneity and shed light into the underlying cellular process to better understand development and disease mechanisms. The unique analytic challenge is to appropriately model highly over-dispersed scRNA-seq count data with prevalent dropouts (zero counts), making zero-inflated dimensionality reduction techniques popular for scRNA-seq data analyses. Employing zero-inflated distributions, however, may place extra emphasis on zero counts, leading to potential bias when identifying the latent structure of the data. RESULTS: In this paper, we propose a fully generative hierarchical gamma-negative binomial (hGNB) model of scRNA-seq data, obviating the need for explicitly modeling zero inflation. At the same time, hGNB can naturally account for covariate effects at both the gene and cell levels to identify complex latent representations of scRNA-seq data, without the need for commonly adopted pre-processing steps such as normalization. Efficient Bayesian model inference is derived by exploiting conditional conjugacy via novel data augmentation techniques. CONCLUSION: Experimental results on both simulated data and several real-world scRNA-seq datasets suggest that hGNB is a powerful tool for cell cluster discovery as well as cell lineage inference. 
BioMed Central 2020-09-09 /pmc/articles/PMC7487589/ /pubmed/32900358 http://dx.doi.org/10.1186/s12864-020-06938-8 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Dadaneh, Siamak Zamani
de Figueiredo, Paul
Sze, Sing-Hoi
Zhou, Mingyuan
Qian, Xiaoning
Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data
title Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data
title_full Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data
title_fullStr Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data
title_full_unstemmed Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data
title_short Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data
title_sort bayesian gamma-negative binomial modeling of single-cell rna sequencing data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7487589/
https://www.ncbi.nlm.nih.gov/pubmed/32900358
http://dx.doi.org/10.1186/s12864-020-06938-8
work_keys_str_mv AT dadanehsiamakzamani bayesiangammanegativebinomialmodelingofsinglecellrnasequencingdata
AT defigueiredopaul bayesiangammanegativebinomialmodelingofsinglecellrnasequencingdata
AT szesinghoi bayesiangammanegativebinomialmodelingofsinglecellrnasequencingdata
AT zhoumingyuan bayesiangammanegativebinomialmodelingofsinglecellrnasequencingdata
AT qianxiaoning bayesiangammanegativebinomialmodelingofsinglecellrnasequencingdata