
Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model

Single-cell RNA-seq (scRNA-seq) is a powerful tool to measure the expression patterns of individual cells and discover heterogeneity and functional diversity among cell populations. Due to variability, it is challenging to analyze such data efficiently. Many clustering methods have been developed us...


Bibliographic Details
Main author: Liu, Zhenqiu
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7927011/
https://www.ncbi.nlm.nih.gov/pubmed/33671799
http://dx.doi.org/10.3390/genes12020311
_version_ 1783659594117021696
author Liu, Zhenqiu
author_facet Liu, Zhenqiu
author_sort Liu, Zhenqiu
collection PubMed
description Single-cell RNA-seq (scRNA-seq) is a powerful tool to measure the expression patterns of individual cells and discover heterogeneity and functional diversity among cell populations. Due to variability, it is challenging to analyze such data efficiently. Many clustering methods have been developed using at least one free parameter. Different choices for free parameters may lead to substantially different visualizations and clusters. Tuning free parameters is also time-consuming. Thus, there is a need for a simple, robust, and efficient clustering method. In this paper, we propose a new regularized Gaussian graphical clustering (RGGC) method for scRNA-seq data. RGGC is based on high-order (partial) correlations and subspace learning, and is robust over a wide range of the regularization parameter [Formula: see text]. Therefore, we can simply set [Formula: see text] or [Formula: see text] for AIC (Akaike information criterion) or BIC (Bayesian information criterion) without cross-validation. Cell subpopulations are discovered by the Louvain community detection algorithm that determines the number of clusters automatically. There is no free parameter to be tuned with RGGC. When evaluated with simulated and benchmark scRNA-seq data sets against widely used methods, RGGC is computationally efficient and one of the top performers. It can detect inter-sample cell heterogeneity when applied to glioblastoma scRNA-seq data.
format Online
Article
Text
id pubmed-7927011
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7927011 2021-03-04 Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model Liu, Zhenqiu Genes (Basel) Article Single-cell RNA-seq (scRNA-seq) is a powerful tool to measure the expression patterns of individual cells and discover heterogeneity and functional diversity among cell populations. Due to variability, it is challenging to analyze such data efficiently. Many clustering methods have been developed using at least one free parameter. Different choices for free parameters may lead to substantially different visualizations and clusters. Tuning free parameters is also time-consuming. Thus, there is a need for a simple, robust, and efficient clustering method. In this paper, we propose a new regularized Gaussian graphical clustering (RGGC) method for scRNA-seq data. RGGC is based on high-order (partial) correlations and subspace learning, and is robust over a wide range of the regularization parameter [Formula: see text]. Therefore, we can simply set [Formula: see text] or [Formula: see text] for AIC (Akaike information criterion) or BIC (Bayesian information criterion) without cross-validation. Cell subpopulations are discovered by the Louvain community detection algorithm that determines the number of clusters automatically. There is no free parameter to be tuned with RGGC. When evaluated with simulated and benchmark scRNA-seq data sets against widely used methods, RGGC is computationally efficient and one of the top performers. It can detect inter-sample cell heterogeneity when applied to glioblastoma scRNA-seq data. MDPI 2021-02-22 /pmc/articles/PMC7927011/ /pubmed/33671799 http://dx.doi.org/10.3390/genes12020311 Text en © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Liu, Zhenqiu
Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model
title Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model
title_full Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model
title_fullStr Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model
title_full_unstemmed Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model
title_short Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model
title_sort clustering single-cell rna-seq data with regularized gaussian graphical model
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7927011/
https://www.ncbi.nlm.nih.gov/pubmed/33671799
http://dx.doi.org/10.3390/genes12020311
work_keys_str_mv AT liuzhenqiu clusteringsinglecellrnaseqdatawithregularizedgaussiangraphicalmodel