Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics

BACKGROUND: Single-cell RNA sequencing is a powerful tool for characterizing cellular heterogeneity in gene expression. However, high variability and a large number of zero counts present challenges for analysis and interpretation. There is substantial controversy over the origins and proper treatme...

Bibliographic Details
Main authors: Choi, Kwangbom; Chen, Yang; Skelly, Daniel A.; Churchill, Gary A.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7384222/
https://www.ncbi.nlm.nih.gov/pubmed/32718323
http://dx.doi.org/10.1186/s13059-020-02103-2
author Choi, Kwangbom
Chen, Yang
Skelly, Daniel A.
Churchill, Gary A.
collection PubMed
description BACKGROUND: Single-cell RNA sequencing is a powerful tool for characterizing cellular heterogeneity in gene expression. However, high variability and a large number of zero counts present challenges for analysis and interpretation. There is substantial controversy over the origins and proper treatment of zeros and no consensus on whether zero-inflated count distributions are necessary or even useful. While some studies assume the existence of zero inflation due to technical artifacts and attempt to impute the missing information, other recent studies argue that there is no zero inflation in scRNA-seq data. RESULTS: We apply a Bayesian model selection approach to unambiguously demonstrate zero inflation in multiple biologically realistic scRNA-seq datasets. We show that the primary causes of zero inflation are not technical but rather biological in nature. We also demonstrate that parameter estimates from the zero-inflated negative binomial distribution are an unreliable indicator of zero inflation. CONCLUSIONS: Despite the existence of zero inflation in scRNA-seq counts, we recommend the generalized linear model with negative binomial count distribution, not zero-inflated, as a suitable reference model for scRNA-seq analysis.
format Online
Article
Text
id pubmed-7384222
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7384222 2020-07-28 Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics Choi, Kwangbom Chen, Yang Skelly, Daniel A. Churchill, Gary A. Genome Biol Research BioMed Central 2020-07-27 /pmc/articles/PMC7384222/ /pubmed/32718323 http://dx.doi.org/10.1186/s13059-020-02103-2 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics
topic Research