Self-supervised deep clustering of single-cell RNA-seq data to hierarchically detect rare cell populations

Single-cell RNA sequencing (scRNA-seq) is a widely used technique for characterizing individual cells and studying gene expression at the single-cell level. Clustering plays a vital role in grouping similar cells together for various downstream analyses. However, the high sparsity and dimensionality...

Bibliographic Details
Main Authors: Lei, Tianyuan; Chen, Ruoyu; Zhang, Shaoqiang; Chen, Yong
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Problem Solving Protocol
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10539043/
https://www.ncbi.nlm.nih.gov/pubmed/37769630
http://dx.doi.org/10.1093/bib/bbad335
_version_ 1785113416314126336
author Lei, Tianyuan
Chen, Ruoyu
Zhang, Shaoqiang
Chen, Yong
collection PubMed
description Single-cell RNA sequencing (scRNA-seq) is a widely used technique for characterizing individual cells and studying gene expression at the single-cell level. Clustering plays a vital role in grouping similar cells together for various downstream analyses. However, the high sparsity and dimensionality of large scRNA-seq data pose challenges to clustering performance. Although several deep learning-based clustering algorithms have been proposed, most existing methods are limited in capturing the precise distribution types of the data or in fully utilizing the relationships between cells, leaving considerable scope for improving clustering performance, particularly in detecting rare cell populations in large scRNA-seq data. We introduce DeepScena, a novel single-cell hierarchical clustering tool that integrates nonlinear dimension reduction, a negative binomial-based convolutional autoencoder for data fitting, and a self-supervision model for cell similarity enhancement. In a comprehensive evaluation using multiple large-scale scRNA-seq datasets, DeepScena consistently outperformed seven popular clustering tools in terms of accuracy. Notably, DeepScena exhibits high proficiency in identifying rare cell populations within large datasets that contain large numbers of clusters. When applied to scRNA-seq data of multiple myeloma cells, DeepScena successfully identified not only the previously labeled large cell types but also subpopulations within CD14 monocytes, T cells and natural killer cells.
format Online
Article
Text
id pubmed-10539043
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10539043 2023-09-29 Self-supervised deep clustering of single-cell RNA-seq data to hierarchically detect rare cell populations Lei, Tianyuan Chen, Ruoyu Zhang, Shaoqiang Chen, Yong Brief Bioinform Problem Solving Protocol Oxford University Press 2023-09-28 /pmc/articles/PMC10539043/ /pubmed/37769630 http://dx.doi.org/10.1093/bib/bbad335 Text en © The Author(s) 2023. Published by Oxford University Press.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
title Self-supervised deep clustering of single-cell RNA-seq data to hierarchically detect rare cell populations
topic Problem Solving Protocol