Single Cell Self-Paced Clustering with Transcriptome Sequencing Data

Single cell RNA sequencing (scRNA-seq) allows researchers to explore tissue heterogeneity, distinguish unusual cell identities, and find novel cellular subtypes by providing transcriptome profiling for individual cells. Clustering analysis is usually used to predict cell class assignments and infer...

Bibliographic Details
Main authors: Zhao, Peng; Xu, Zenglin; Chen, Junjie; Ren, Yazhou; King, Irwin
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8999118/
https://www.ncbi.nlm.nih.gov/pubmed/35409258
http://dx.doi.org/10.3390/ijms23073900
author Zhao, Peng
Xu, Zenglin
Chen, Junjie
Ren, Yazhou
King, Irwin
collection PubMed
description Single cell RNA sequencing (scRNA-seq) allows researchers to explore tissue heterogeneity, distinguish unusual cell identities, and find novel cellular subtypes by providing transcriptome profiling for individual cells. Clustering analysis is usually used to predict cell class assignments and infer cell identities. However, the performance of existing single-cell clustering methods is extremely sensitive to noisy data and outliers, and existing clustering algorithms can easily become trapped in local optima. There is still no consensus on the best-performing method. To address these issues, we introduce a single cell self-paced clustering (scSPaC) method with F-norm based nonnegative matrix factorization (NMF) for scRNA-seq data and a sparse single cell self-paced clustering (sscSPaC) method with [Formula: see text]-norm based nonnegative matrix factorization for scRNA-seq data. We gradually add single cells to our model, from simple to complex, until all cells are selected. In this way, the influence of noisy data and outliers can be significantly reduced. The proposed method achieved the best performance on both simulated data and real scRNA-seq data. A case study on clustering scRNA-seq data from human Clara cells and ependymal cells shows that scSPaC is more advantageous near the boundary between clusters.
format Online
Article
Text
id pubmed-8999118
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8999118 2022-04-12 Int J Mol Sci Article MDPI 2022-03-31 /pmc/articles/PMC8999118/ /pubmed/35409258 http://dx.doi.org/10.3390/ijms23073900 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Single Cell Self-Paced Clustering with Transcriptome Sequencing Data
topic Article