Boosting scRNA-seq data clustering by cluster-aware feature weighting

BACKGROUND: The rapid development of single-cell RNA sequencing (scRNA-seq) enables the exploration of cell heterogeneity, which is usually done by scRNA-seq data clustering. The essence of scRNA-seq data clustering is to group cells by measuring the similarities among genes/transcripts of cells. An...

Bibliographic Details
Main Authors: Li, Rui-Yi, Guan, Jihong, Zhou, Shuigeng
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8171019/
https://www.ncbi.nlm.nih.gov/pubmed/34078287
http://dx.doi.org/10.1186/s12859-021-04033-7
_version_ 1783702351722315776
author Li, Rui-Yi
Guan, Jihong
Zhou, Shuigeng
author_facet Li, Rui-Yi
Guan, Jihong
Zhou, Shuigeng
author_sort Li, Rui-Yi
collection PubMed
description BACKGROUND: The rapid development of single-cell RNA sequencing (scRNA-seq) enables the exploration of cell heterogeneity, which is usually done by clustering scRNA-seq data. The essence of scRNA-seq data clustering is to group cells by measuring the similarities among their genes/transcripts. The selection of features for evaluating cell similarity is therefore of great importance, because it significantly affects clustering effectiveness and efficiency. RESULTS: In this paper, we propose a novel method called CaFew that selects genes based on cluster-aware feature weighting. By optimizing the clustering objective function, CaFew obtains a feature weight matrix, which is then used for feature selection. Genes that have large weights in at least one cluster, or whose weights vary greatly across clusters, are selected. Experiments on 8 real scRNA-seq datasets show that CaFew clearly improves the clustering performance of existing scRNA-seq data clustering methods. In particular, the combination of CaFew with SC3 achieves state-of-the-art performance. Furthermore, CaFew also benefits the visualization of scRNA-seq data. CONCLUSION: CaFew is effective for scRNA-seq data clustering owing to its gene selection mechanism based on cluster-aware feature weighting, and it is a useful tool for scRNA-seq data analysis.
format Online
Article
Text
id pubmed-8171019
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8171019 2021-06-03 Boosting scRNA-seq data clustering by cluster-aware feature weighting Li, Rui-Yi Guan, Jihong Zhou, Shuigeng BMC Bioinformatics Research
BioMed Central 2021-06-02 /pmc/articles/PMC8171019/ /pubmed/34078287 http://dx.doi.org/10.1186/s12859-021-04033-7 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Li, Rui-Yi
Guan, Jihong
Zhou, Shuigeng
Boosting scRNA-seq data clustering by cluster-aware feature weighting
title Boosting scRNA-seq data clustering by cluster-aware feature weighting
title_full Boosting scRNA-seq data clustering by cluster-aware feature weighting
title_fullStr Boosting scRNA-seq data clustering by cluster-aware feature weighting
title_full_unstemmed Boosting scRNA-seq data clustering by cluster-aware feature weighting
title_short Boosting scRNA-seq data clustering by cluster-aware feature weighting
title_sort boosting scrna-seq data clustering by cluster-aware feature weighting
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8171019/
https://www.ncbi.nlm.nih.gov/pubmed/34078287
http://dx.doi.org/10.1186/s12859-021-04033-7
work_keys_str_mv AT liruiyi boostingscrnaseqdataclusteringbyclusterawarefeatureweighting
AT guanjihong boostingscrnaseqdataclusteringbyclusterawarefeatureweighting
AT zhoushuigeng boostingscrnaseqdataclusteringbyclusterawarefeatureweighting