Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics
BACKGROUND: Single-cell RNA sequencing is a powerful tool for characterizing cellular heterogeneity in gene expression. However, high variability and a large number of zero counts present challenges for analysis and interpretation. There is substantial controversy over the origins and proper treatme...
Main authors: | Choi, Kwangbom; Chen, Yang; Skelly, Daniel A.; Churchill, Gary A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7384222/ https://www.ncbi.nlm.nih.gov/pubmed/32718323 http://dx.doi.org/10.1186/s13059-020-02103-2 |
_version_ | 1783563584287014912 |
---|---|
author | Choi, Kwangbom Chen, Yang Skelly, Daniel A. Churchill, Gary A. |
author_facet | Choi, Kwangbom Chen, Yang Skelly, Daniel A. Churchill, Gary A. |
author_sort | Choi, Kwangbom |
collection | PubMed |
description | BACKGROUND: Single-cell RNA sequencing is a powerful tool for characterizing cellular heterogeneity in gene expression. However, high variability and a large number of zero counts present challenges for analysis and interpretation. There is substantial controversy over the origins and proper treatment of zeros and no consensus on whether zero-inflated count distributions are necessary or even useful. While some studies assume the existence of zero inflation due to technical artifacts and attempt to impute the missing information, other recent studies argue that there is no zero inflation in scRNA-seq data. RESULTS: We apply a Bayesian model selection approach to unambiguously demonstrate zero inflation in multiple biologically realistic scRNA-seq datasets. We show that the primary causes of zero inflation are not technical but rather biological in nature. We also demonstrate that parameter estimates from the zero-inflated negative binomial distribution are an unreliable indicator of zero inflation. CONCLUSIONS: Despite the existence of zero inflation in scRNA-seq counts, we recommend the generalized linear model with negative binomial count distribution, not zero-inflated, as a suitable reference model for scRNA-seq analysis. |
format | Online Article Text |
id | pubmed-7384222 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-73842222020-07-28 Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics Choi, Kwangbom Chen, Yang Skelly, Daniel A. Churchill, Gary A. Genome Biol Research BACKGROUND: Single-cell RNA sequencing is a powerful tool for characterizing cellular heterogeneity in gene expression. However, high variability and a large number of zero counts present challenges for analysis and interpretation. There is substantial controversy over the origins and proper treatment of zeros and no consensus on whether zero-inflated count distributions are necessary or even useful. While some studies assume the existence of zero inflation due to technical artifacts and attempt to impute the missing information, other recent studies argue that there is no zero inflation in scRNA-seq data. RESULTS: We apply a Bayesian model selection approach to unambiguously demonstrate zero inflation in multiple biologically realistic scRNA-seq datasets. We show that the primary causes of zero inflation are not technical but rather biological in nature. We also demonstrate that parameter estimates from the zero-inflated negative binomial distribution are an unreliable indicator of zero inflation. CONCLUSIONS: Despite the existence of zero inflation in scRNA-seq counts, we recommend the generalized linear model with negative binomial count distribution, not zero-inflated, as a suitable reference model for scRNA-seq analysis. BioMed Central 2020-07-27 /pmc/articles/PMC7384222/ /pubmed/32718323 http://dx.doi.org/10.1186/s13059-020-02103-2 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Choi, Kwangbom Chen, Yang Skelly, Daniel A. Churchill, Gary A. Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics |
title | Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics |
title_full | Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics |
title_fullStr | Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics |
title_full_unstemmed | Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics |
title_short | Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics |
title_sort | bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7384222/ https://www.ncbi.nlm.nih.gov/pubmed/32718323 http://dx.doi.org/10.1186/s13059-020-02103-2 |