Clustering Deviation Index (CDI): a robust and accurate internal measure for evaluating scRNA-seq data clustering

Most single-cell RNA sequencing (scRNA-seq) analyses begin with cell clustering; thus, the clustering accuracy considerably impacts the validity of downstream analyses. In contrast with the abundance of clustering methods, the tools to assess the clustering accuracy are limited. We propose a new Clu...


Bibliographic Details
Main authors: Fang, Jiyuan; Chan, Cliburn; Owzar, Kouros; Wang, Liuyang; Qin, Diyuan; Li, Qi-Jing; Xie, Jichun
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9793368/
https://www.ncbi.nlm.nih.gov/pubmed/36575517
http://dx.doi.org/10.1186/s13059-022-02825-5
author Fang, Jiyuan
Chan, Cliburn
Owzar, Kouros
Wang, Liuyang
Qin, Diyuan
Li, Qi-Jing
Xie, Jichun
collection PubMed
description Most single-cell RNA sequencing (scRNA-seq) analyses begin with cell clustering; thus, the clustering accuracy considerably impacts the validity of downstream analyses. In contrast with the abundance of clustering methods, the tools to assess the clustering accuracy are limited. We propose a new Clustering Deviation Index (CDI) that measures the deviation of any clustering label set from the observed single-cell data. We conduct in silico and experimental scRNA-seq studies to show that CDI can select the optimal clustering label set. As a result, CDI also informs the optimal tuning parameters for any given clustering method and the correct number of cluster components. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-022-02825-5.
format Online
Article
Text
id pubmed-9793368
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9793368 2022-12-27 Clustering Deviation Index (CDI): a robust and accurate internal measure for evaluating scRNA-seq data clustering Fang, Jiyuan; Chan, Cliburn; Owzar, Kouros; Wang, Liuyang; Qin, Diyuan; Li, Qi-Jing; Xie, Jichun Genome Biol Method Most single-cell RNA sequencing (scRNA-seq) analyses begin with cell clustering; thus, the clustering accuracy considerably impacts the validity of downstream analyses. In contrast with the abundance of clustering methods, the tools to assess the clustering accuracy are limited. We propose a new Clustering Deviation Index (CDI) that measures the deviation of any clustering label set from the observed single-cell data. We conduct in silico and experimental scRNA-seq studies to show that CDI can select the optimal clustering label set. As a result, CDI also informs the optimal tuning parameters for any given clustering method and the correct number of cluster components. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-022-02825-5. BioMed Central 2022-12-27 /pmc/articles/PMC9793368/ /pubmed/36575517 http://dx.doi.org/10.1186/s13059-022-02825-5 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Clustering Deviation Index (CDI): a robust and accurate internal measure for evaluating scRNA-seq data clustering
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9793368/
https://www.ncbi.nlm.nih.gov/pubmed/36575517
http://dx.doi.org/10.1186/s13059-022-02825-5