Supervised dimensionality reduction for exploration of single-cell data by HSS-LDA

Single-cell technologies generate large, high-dimensional datasets encompassing a diversity of omics. Dimensionality reduction captures the structure and heterogeneity of the original dataset, creating low-dimensional visualizations that contribute to the human understanding of data. Existing algori...

Bibliographic Details
Main authors: Amouzgar, Meelad; Glass, David R.; Baskar, Reema; Averbukh, Inna; Kimmey, Samuel C.; Tsai, Albert G.; Hartmann, Felix J.; Bendall, Sean C.
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9403402/
https://www.ncbi.nlm.nih.gov/pubmed/36033591
http://dx.doi.org/10.1016/j.patter.2022.100536
author Amouzgar, Meelad
Glass, David R.
Baskar, Reema
Averbukh, Inna
Kimmey, Samuel C.
Tsai, Albert G.
Hartmann, Felix J.
Bendall, Sean C.
collection PubMed
description Single-cell technologies generate large, high-dimensional datasets encompassing a diversity of omics. Dimensionality reduction captures the structure and heterogeneity of the original dataset, creating low-dimensional visualizations that contribute to the human understanding of data. Existing algorithms are typically unsupervised, using measured features to generate manifolds, disregarding known biological labels such as cell type or experimental time point. We repurpose the classification algorithm, linear discriminant analysis (LDA), for supervised dimensionality reduction of single-cell data. LDA identifies linear combinations of predictors that optimally separate a priori classes, enabling the study of specific aspects of cellular heterogeneity. We implement feature selection by hybrid subset selection (HSS) and demonstrate that this computationally efficient approach generates non-stochastic, interpretable axes amenable to diverse biological processes such as differentiation over time and cell cycle. We benchmark HSS-LDA against several popular dimensionality-reduction algorithms and illustrate its utility and versatility for the exploration of single-cell mass cytometry, transcriptomics, and chromatin accessibility data.
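As a rough illustration of the LDA step described in the abstract, a two-class Fisher discriminant can be sketched in plain Python. This is not the authors' HSS-LDA implementation and it performs no feature selection; the toy data, the function names, and the restriction to two classes are assumptions made for this example only.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def within_class_scatter(groups):
    """Sum over classes of (x - class mean)(x - class mean)^T."""
    p = len(groups[0][0])
    sw = [[0.0] * p for _ in range(p)]
    for rows in groups:
        mu = mean_vector(rows)
        for x in rows:
            d = [xi - mi for xi, mi in zip(x, mu)]
            for i in range(p):
                for j in range(p):
                    sw[i][j] += d[i] * d[j]
    return sw


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fisher_axis(class_a, class_b):
    """Two-class LDA direction: w = Sw^{-1} (mean_b - mean_a)."""
    sw = within_class_scatter([class_a, class_b])
    diff = [b - a for a, b in zip(mean_vector(class_a), mean_vector(class_b))]
    return solve(sw, diff)


def project(rows, w):
    """Project each feature vector onto the discriminant axis w."""
    return [sum(xi * wi for xi, wi in zip(x, w)) for x in rows]


# Hypothetical toy data: 3 features per "cell"; only feature 0 differs
# between the two labelled classes.
rng = random.Random(0)
class_a = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
class_b = [[rng.gauss(3, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]

w = fisher_axis(class_a, class_b)
scores_a = project(class_a, w)
scores_b = project(class_b, w)
```

The axis `w` is a linear combination of the measured features, so its coefficients can be read directly: here the weight on feature 0 dominates because that is the only feature separating the classes. Given the same labelled data, the computation returns the same axis every time, which is the sense in which the abstract calls the approach non-stochastic.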
format Online
Article
Text
id pubmed-9403402
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9403402 2022-08-26 Supervised dimensionality reduction for exploration of single-cell data by HSS-LDA. Amouzgar, Meelad; Glass, David R.; Baskar, Reema; Averbukh, Inna; Kimmey, Samuel C.; Tsai, Albert G.; Hartmann, Felix J.; Bendall, Sean C. Patterns (N Y). Article. Elsevier 2022-06-24. /pmc/articles/PMC9403402/ /pubmed/36033591 http://dx.doi.org/10.1016/j.patter.2022.100536 Text en © 2022 The Authors. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Supervised dimensionality reduction for exploration of single-cell data by HSS-LDA
topic Article