scShapes: a statistical framework for identifying distribution shapes in single-cell RNA-sequencing data

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) methods have been advantageous for quantifying cell-to-cell variation by profiling the transcriptomes of individual cells. For scRNA-seq data, variability in gene expression reflects the degree of variation in gene expression from one cell to anothe...

Bibliographic Details
Main authors: Dharmaratne, Malindrie, Kulkarni, Ameya S, Taherian Fard, Atefeh, Mar, Jessica C
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9871437/
https://www.ncbi.nlm.nih.gov/pubmed/36691728
http://dx.doi.org/10.1093/gigascience/giac126
_version_ 1784877172001865728
author Dharmaratne, Malindrie
Kulkarni, Ameya S
Taherian Fard, Atefeh
Mar, Jessica C
collection PubMed
description BACKGROUND: Single-cell RNA sequencing (scRNA-seq) methods have been advantageous for quantifying cell-to-cell variation by profiling the transcriptomes of individual cells. For scRNA-seq data, variability in gene expression reflects the degree of variation in gene expression from one cell to another. Analyses that focus on cell–cell variability therefore are useful for going beyond changes based on average expression and, instead, identifying genes with homogeneous expression versus those that vary widely from cell to cell. RESULTS: We present a novel statistical framework, scShapes, for identifying differential distributions in single-cell RNA-sequencing data using generalized linear models. Most approaches for differential gene expression detect shifts in the mean value. However, as single-cell data are driven by overdispersion and dropouts, moving beyond means and using distributions that can handle excess zeros is critical. scShapes quantifies gene-specific cell-to-cell variability by testing for differences in the expression distribution while flexibly adjusting for covariates if required. We demonstrate that scShapes identifies subtle variations that are independent of altered mean expression and detects biologically relevant genes that were not discovered through standard approaches. CONCLUSIONS: This analysis also draws attention to genes that switch distribution shapes from a unimodal distribution to a zero-inflated distribution and raises open questions about the plausible biological mechanisms that may give rise to this, such as transcriptional bursting. Overall, the results from scShapes help to expand our understanding of the role that gene expression plays in the transcriptional regulation of a specific perturbation or cellular phenotype. Our framework scShapes is incorporated into a Bioconductor R package (https://www.bioconductor.org/packages/release/bioc/html/scShapes.html).
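The idea in the abstract of testing whether a gene's counts need a distribution that handles excess zeros can be illustrated with a minimal sketch. This is not the scShapes implementation: it assumes an intercept-only model with no covariates, considers only two candidate shapes (Poisson and zero-inflated Poisson), fits the latter with a simple EM loop, and picks the shape with the lower BIC. The function names, the choice of BIC, and the toy count vectors are all hypothetical and chosen for illustration only.

```python
import math


def poisson_loglik(counts, lam):
    """Log-likelihood of the counts under Poisson(lam)."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)


def zip_loglik(counts, pi, lam):
    """Log-likelihood under a zero-inflated Poisson with zero weight pi."""
    total = 0.0
    for k in counts:
        if k == 0:
            # A zero is either structural (prob pi) or a Poisson zero.
            total += math.log(pi + (1.0 - pi) * math.exp(-lam))
        else:
            total += (math.log(1.0 - pi)
                      + k * math.log(lam) - lam - math.lgamma(k + 1))
    return total


def fit_zip(counts, iters=200):
    """Fit (pi, lam) by EM; assumes at least one non-zero count."""
    n = len(counts)
    n_zero = sum(1 for k in counts if k == 0)
    total = sum(counts)
    pi, lam = 0.5 * n_zero / n, total / n
    for _ in range(iters):
        # E-step: probability that an observed zero is structural.
        z = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean.
        structural = n_zero * z
        pi = structural / n
        lam = total / (n - structural)
    return pi, lam


def classify_shape(counts):
    """Return 'Poisson' or 'ZIP', whichever has the lower BIC."""
    n, total = len(counts), sum(counts)
    if total == 0:
        raise ValueError("need at least one non-zero count")
    bic_pois = 1 * math.log(n) - 2 * poisson_loglik(counts, total / n)
    pi, lam = fit_zip(counts)
    bic_zip = 2 * math.log(n) - 2 * zip_loglik(counts, pi, lam)
    return "ZIP" if bic_zip < bic_pois else "Poisson"


# Toy data (hypothetical): one gene observed in 20 cells per condition.
homogeneous = [2, 3, 1, 4, 2, 3, 2, 1, 3, 2, 0, 4, 3, 2, 2, 1, 3, 2, 5, 2]
bursty = [0] * 12 + [5, 6, 4, 7, 5, 6, 5, 4]

print(classify_shape(homogeneous), classify_shape(bursty))
```

A gene whose selected shape differs between the two conditions would be a candidate for a distribution switch of the kind the abstract describes. A fuller treatment would also include overdispersed candidates such as the negative binomial and would adjust for covariates.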
format Online
Article
Text
id pubmed-9871437
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9871437 2023-01-31 Gigascience Research Oxford University Press 2023-01-24 /pmc/articles/PMC9871437/ /pubmed/36691728 http://dx.doi.org/10.1093/gigascience/giac126 Text en © The Author(s) 2023. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title scShapes: a statistical framework for identifying distribution shapes in single-cell RNA-sequencing data
topic Research