
A Hybrid Clustering Algorithm for Identifying Cell Types from Single-Cell RNA-Seq Data

Single-cell RNA sequencing (scRNA-seq) has recently brought new insight into cell differentiation processes and functional variation in cell subtypes from homogeneous cell populations. A lack of prior knowledge makes unsupervised machine learning methods, such as clustering, suitable for analyzing s...


Bibliographic Details
Main authors: Zhu, Xiaoshu; Li, Hong-Dong; Xu, Yunpei; Guo, Lilu; Wu, Fang-Xiang; Duan, Guihua; Wang, Jianxin
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6409843/
https://www.ncbi.nlm.nih.gov/pubmed/30700040
http://dx.doi.org/10.3390/genes10020098
_version_ 1783402083274194944
author Zhu, Xiaoshu
Li, Hong-Dong
Xu, Yunpei
Guo, Lilu
Wu, Fang-Xiang
Duan, Guihua
Wang, Jianxin
author_sort Zhu, Xiaoshu
collection PubMed
description Single-cell RNA sequencing (scRNA-seq) has recently brought new insight into cell differentiation processes and functional variation in cell subtypes from homogeneous cell populations. A lack of prior knowledge makes unsupervised machine learning methods, such as clustering, suitable for analyzing scRNA-seq. However, there are several limitations to overcome, including high dimensionality, clustering result instability, and parameter adjustment complexity. In this study, we propose a method by combining structure entropy and k nearest neighbor to identify cell subpopulations in scRNA-seq data. In contrast to existing clustering methods for identifying cell subtypes, minimized structure entropy results in natural communities without specifying the number of clusters. To investigate the performance of our model, we applied it to eight scRNA-seq datasets and compared our method with three existing methods (nonnegative matrix factorization, single-cell interpretation via multikernel learning, and structural entropy minimization principle). The experimental results showed that our approach achieves, on average, better performance in these datasets compared to the benchmark methods.
format Online
Article
Text
id pubmed-6409843
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6409843 2019-03-26 A Hybrid Clustering Algorithm for Identifying Cell Types from Single-Cell RNA-Seq Data Zhu, Xiaoshu Li, Hong-Dong Xu, Yunpei Guo, Lilu Wu, Fang-Xiang Duan, Guihua Wang, Jianxin Genes (Basel) Article MDPI 2019-01-29 /pmc/articles/PMC6409843/ /pubmed/30700040 http://dx.doi.org/10.3390/genes10020098 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title A Hybrid Clustering Algorithm for Identifying Cell Types from Single-Cell RNA-Seq Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6409843/
https://www.ncbi.nlm.nih.gov/pubmed/30700040
http://dx.doi.org/10.3390/genes10020098