scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data
BACKGROUND: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both cl...
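Main authors: | Ranjan, Bobby; Schmidt, Florian; Sun, Wenjie; Park, Jinyu; Honardoost, Mohammad Amin; Tan, Joanna; Arul Rayan, Nirmala; Prabhakar, Shyam |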
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8042883/ https://www.ncbi.nlm.nih.gov/pubmed/33845760 http://dx.doi.org/10.1186/s12859-021-04028-4 |
_version_ | 1783678209107165184 |
---|---|
author | Ranjan, Bobby Schmidt, Florian Sun, Wenjie Park, Jinyu Honardoost, Mohammad Amin Tan, Joanna Arul Rayan, Nirmala Prabhakar, Shyam |
author_facet | Ranjan, Bobby Schmidt, Florian Sun, Wenjie Park, Jinyu Honardoost, Mohammad Amin Tan, Joanna Arul Rayan, Nirmala Prabhakar, Shyam |
author_sort | Ranjan, Bobby |
collection | PubMed |
description | BACKGROUND: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering approaches have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. RESULTS: We present scConsensus, an R framework for generating a consensus clustering by (1) integrating results from both unsupervised and supervised approaches and (2) refining the consensus clusters using differentially expressed genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. CONCLUSIONS: scConsensus combines the merits of unsupervised and supervised approaches to partition cells with better cluster separation and homogeneity, thereby increasing our confidence in detecting distinct cell types. scConsensus is implemented in R and is freely available on GitHub at https://github.com/prabhakarlab/scConsensus. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04028-4. |
format | Online Article Text |
id | pubmed-8042883 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-80428832021-04-14 scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data Ranjan, Bobby Schmidt, Florian Sun, Wenjie Park, Jinyu Honardoost, Mohammad Amin Tan, Joanna Arul Rayan, Nirmala Prabhakar, Shyam BMC Bioinformatics Methodology Article BACKGROUND: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering approaches have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. RESULTS: We present scConsensus, an R framework for generating a consensus clustering by (1) integrating results from both unsupervised and supervised approaches and (2) refining the consensus clusters using differentially expressed genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. CONCLUSIONS: scConsensus combines the merits of unsupervised and supervised approaches to partition cells with better cluster separation and homogeneity, thereby increasing our confidence in detecting distinct cell types. scConsensus is implemented in R and is freely available on GitHub at https://github.com/prabhakarlab/scConsensus. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04028-4. 
BioMed Central 2021-04-12 /pmc/articles/PMC8042883/ /pubmed/33845760 http://dx.doi.org/10.1186/s12859-021-04028-4 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8042883/ https://www.ncbi.nlm.nih.gov/pubmed/33845760 http://dx.doi.org/10.1186/s12859-021-04028-4 |
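
The description in this record outlines a two-step idea: (1) integrate an unsupervised and a supervised labelling of the same cells, then (2) refine the result using differentially expressed genes. A minimal Python sketch of step (1) only, under stated assumptions, could look like the following. It is not the scConsensus API and does not reproduce the published method; the function name `consensus_labels`, the `min_cells` threshold, and the toy labels are all hypothetical, and the refinement step based on differentially expressed genes is not shown.

```python
from collections import Counter


def consensus_labels(unsupervised, supervised, min_cells=2):
    """Combine two labellings of the same cells (illustrative only).

    Each cell is assigned the pair (unsupervised cluster, supervised label).
    Pairs containing fewer than `min_cells` cells fall back to the
    unsupervised cluster alone, so tiny intersections do not become
    clusters of their own. `min_cells` is an assumed, hypothetical knob.
    """
    if len(unsupervised) != len(supervised):
        raise ValueError("both labellings must cover the same cells")

    pairs = list(zip(unsupervised, supervised))
    counts = Counter(pairs)  # sparse contingency table of the two labellings

    result = []
    for u, s in pairs:
        if counts[(u, s)] >= min_cells:
            result.append(f"{u}|{s}")  # keep the intersection as a cluster
        else:
            result.append(str(u))      # too small: keep unsupervised cluster
    return result


# Toy data: seven cells, two unsupervised clusters, three reference labels.
unsup = [0, 0, 0, 1, 1, 1, 1]
sup = ["T", "T", "B", "B", "B", "NK", "NK"]
labels = consensus_labels(unsup, sup)
print(labels)
```

In this toy run, unsupervised cluster 1 is split into two consensus clusters because the reference labels disagree within it, while the single cell in the (0, B) intersection stays with cluster 0. This is the kind of complementary information the description refers to.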