Model-based deep embedding for constrained clustering analysis of single cell RNA-seq data
Clustering is a critical step in single cell-based studies. Most existing methods support unsupervised clustering without the a priori exploitation of any domain knowledge. When confronted by the high dimensionality and pervasive dropout events of scRNA-Seq data, purely unsupervised clustering metho...
Main authors: | Tian, Tian; Zhang, Jie; Lin, Xiang; Wei, Zhi; Hakonarson, Hakon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7994574/ https://www.ncbi.nlm.nih.gov/pubmed/33767149 http://dx.doi.org/10.1038/s41467-021-22008-3 |
_version_ | 1783669779986382848 |
---|---|
author | Tian, Tian Zhang, Jie Lin, Xiang Wei, Zhi Hakonarson, Hakon |
author_sort | Tian, Tian |
collection | PubMed |
description | Clustering is a critical step in single cell-based studies. Most existing methods support unsupervised clustering without the a priori exploitation of any domain knowledge. When confronted by the high dimensionality and pervasive dropout events of scRNA-seq data, purely unsupervised clustering methods may not produce biologically interpretable clusters, which complicates cell type assignment. In such cases, the only recourse is for the user to manually and repeatedly tweak clustering parameters until acceptable clusters are found. Consequently, the path to obtaining biologically meaningful clusters can be ad hoc and laborious. Here we report a principled clustering method named scDCC that integrates domain knowledge into the clustering step. Experiments on various scRNA-seq datasets from thousands to tens of thousands of cells show that scDCC can significantly improve clustering performance, facilitating the interpretability of clusters and downstream analyses, such as cell type assignment. |
format | Online Article Text |
id | pubmed-7994574 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7994574 2021-04-16 Nat Commun Article Nature Publishing Group UK 2021-03-25 /pmc/articles/PMC7994574/ /pubmed/33767149 http://dx.doi.org/10.1038/s41467-021-22008-3 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Model-based deep embedding for constrained clustering analysis of single cell RNA-seq data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7994574/ https://www.ncbi.nlm.nih.gov/pubmed/33767149 http://dx.doi.org/10.1038/s41467-021-22008-3 |