Model-based dimensionality reduction for single-cell RNA-seq using generalized bilinear models
Dimensionality reduction is a critical step in the analysis of single-cell RNA-seq data. The standard approach is to apply a transformation to the count matrix, followed by principal components analysis. However, this approach can spuriously indicate heterogeneity where it does not exist and mask tr...
Main authors: | Nicol, Phillip B., Miller, Jeffrey W. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10168202/ https://www.ncbi.nlm.nih.gov/pubmed/37162914 http://dx.doi.org/10.1101/2023.04.21.537881 |
_version_ | 1785038815994314752 |
---|---|
author | Nicol, Phillip B. Miller, Jeffrey W. |
author_facet | Nicol, Phillip B. Miller, Jeffrey W. |
author_sort | Nicol, Phillip B. |
collection | PubMed |
description | Dimensionality reduction is a critical step in the analysis of single-cell RNA-seq data. The standard approach is to apply a transformation to the count matrix, followed by principal components analysis. However, this approach can spuriously indicate heterogeneity where it does not exist and mask true heterogeneity where it does exist. An alternative approach is to directly model the counts, but existing model-based methods tend to be computationally intractable on large datasets and do not quantify uncertainty in the low-dimensional representation. To address these problems, we develop scGBM, a novel method for model-based dimensionality reduction of single-cell RNA-seq data. scGBM employs a scalable algorithm to fit a Poisson bilinear model to datasets with millions of cells and quantifies the uncertainty in each cell’s latent position. Furthermore, scGBM leverages these uncertainties to assess the confidence associated with a given cell clustering. On real and simulated single-cell data, we find that scGBM produces low-dimensional embeddings that better capture relevant biological information while removing unwanted variation. scGBM is publicly available as an R package. |
format | Online Article Text |
id | pubmed-10168202 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10168202 2023-05-10 Model-based dimensionality reduction for single-cell RNA-seq using generalized bilinear models Nicol, Phillip B. Miller, Jeffrey W. bioRxiv Article Dimensionality reduction is a critical step in the analysis of single-cell RNA-seq data. The standard approach is to apply a transformation to the count matrix, followed by principal components analysis. However, this approach can spuriously indicate heterogeneity where it does not exist and mask true heterogeneity where it does exist. An alternative approach is to directly model the counts, but existing model-based methods tend to be computationally intractable on large datasets and do not quantify uncertainty in the low-dimensional representation. To address these problems, we develop scGBM, a novel method for model-based dimensionality reduction of single-cell RNA-seq data. scGBM employs a scalable algorithm to fit a Poisson bilinear model to datasets with millions of cells and quantifies the uncertainty in each cell’s latent position. Furthermore, scGBM leverages these uncertainties to assess the confidence associated with a given cell clustering. On real and simulated single-cell data, we find that scGBM produces low-dimensional embeddings that better capture relevant biological information while removing unwanted variation. scGBM is publicly available as an R package. Cold Spring Harbor Laboratory 2023-04-25 /pmc/articles/PMC10168202/ /pubmed/37162914 http://dx.doi.org/10.1101/2023.04.21.537881 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Nicol, Phillip B. Miller, Jeffrey W. Model-based dimensionality reduction for single-cell RNA-seq using generalized bilinear models |
title | Model-based dimensionality reduction for single-cell RNA-seq using generalized bilinear models |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10168202/ https://www.ncbi.nlm.nih.gov/pubmed/37162914 http://dx.doi.org/10.1101/2023.04.21.537881 |
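
As a reading aid for the description above, a Poisson bilinear model of the kind the abstract names is commonly written in the form sketched below. The notation is an assumption introduced here for illustration and is not taken from this record: $Y_{ij}$ is the count for gene $i$ in cell $j$, $\alpha_i$ and $\beta_j$ are gene and cell intercepts, $M$ is the number of latent factors, and $\sigma_m$, $u_{im}$, $v_{jm}$ are the factor scale, gene loading, and cell score.

```latex
% Sketch of a Poisson bilinear model (notation assumed, not quoted from the record)
Y_{ij} \sim \mathrm{Poisson}(\mu_{ij}), \qquad
\log \mu_{ij} = \alpha_i + \beta_j + \sum_{m=1}^{M} \sigma_m \, u_{im} \, v_{jm}
```

Under this form, the low-dimensional representation of cell $j$ would be read off from its scores $(\sigma_1 v_{j1}, \dots, \sigma_M v_{jM})$. The counts are modeled directly, rather than being transformed and then passed to principal components analysis.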