
Clustering single-cell multi-omics data with MoClust

MOTIVATION: Single-cell multi-omics sequencing techniques have rapidly developed in the past few years. Clustering analysis with single-cell multi-omics data may give us novel perspectives to dissect cellular heterogeneity. However, multi-omics data have the properties of inherited large dimension,...


Bibliographic Details
Main Authors: Yuan, Musu; Chen, Liang; Deng, Minghua
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9805570/
https://www.ncbi.nlm.nih.gov/pubmed/36383167
http://dx.doi.org/10.1093/bioinformatics/btac736
author Yuan, Musu
Chen, Liang
Deng, Minghua
collection PubMed
description MOTIVATION: Single-cell multi-omics sequencing techniques have rapidly developed in the past few years. Clustering analysis with single-cell multi-omics data may give us novel perspectives to dissect cellular heterogeneity. However, multi-omics data have the properties of inherited large dimension, high sparsity and existence of doublets. Moreover, representations of different omics from even the same cell follow diverse distributions. Without proper distribution alignment techniques, clustering methods will encounter less separable clusters easily affected by less informative omics data.
RESULTS: We developed MoClust, a novel joint clustering framework that can be applied to several types of single-cell multi-omics data. A selective automatic doublet detection module that can identify and filter out doublets is introduced in the pretraining stage to improve data quality. Omics-specific autoencoders are introduced to characterize the multi-omics data. A contrastive learning way of distribution alignment is adopted to adaptively fuse omics representations into an omics-invariant representation. This novel way of alignment boosts the compactness and separableness of clusters, while accurately weighting the contribution of each omics to the clustering object. Extensive experiments, over both simulated and real multi-omics datasets, demonstrated the powerful alignment, doublet detection and clustering ability features of MoClust.
AVAILABILITY AND IMPLEMENTATION: An implementation of MoClust is available from https://doi.org/10.5281/zenodo.7306504.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
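The abstract mentions a contrastive way of aligning omics-specific representations but gives no formula. As a rough illustration only, the sketch below computes a generic symmetric InfoNCE-style loss between two embeddings of the same cells, treating the two views of one cell as a positive pair and all other cells as negatives. This is a commonly used contrastive objective, not the loss defined in the MoClust paper; the function name `info_nce_alignment`, the `temperature` value, and the synthetic "RNA-like" and "protein-like" embeddings are all assumptions made for the example.

```python
import numpy as np


def info_nce_alignment(z_a, z_b, temperature=0.5):
    """Symmetric InfoNCE-style loss between two embeddings of the same cells.

    z_a, z_b: arrays of shape (n_cells, dim); row i of each describes cell i.
    Hypothetical helper for illustration, not part of MoClust.
    """
    # L2-normalise rows so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # shape (n_cells, n_cells)

    def diagonal_cross_entropy(scores):
        # Numerically stable row-wise log-softmax; positives sit on the diagonal.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the a-to-b and b-to-a directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


# Synthetic stand-ins for two omics embeddings that share a latent signal.
rng = np.random.default_rng(0)
shared = rng.normal(size=(64, 8))
rna_like = shared + 0.05 * rng.normal(size=shared.shape)
protein_like = shared + 0.05 * rng.normal(size=shared.shape)
shuffled = protein_like[rng.permutation(64)]  # breaks the cell-to-cell pairing

aligned_loss = info_nce_alignment(rna_like, protein_like)
shuffled_loss = info_nce_alignment(rna_like, shuffled)
print(f"paired: {aligned_loss:.3f}  shuffled: {shuffled_loss:.3f}")
```

In this toy setting the loss is lower when the rows of the two embeddings refer to the same cells than when the pairing is shuffled, which is the property a contrastive alignment objective relies on.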
format Online
Article
Text
id pubmed-9805570
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9805570 2023-01-03 Clustering single-cell multi-omics data with MoClust Yuan, Musu Chen, Liang Deng, Minghua Bioinformatics Original Paper Oxford University Press 2022-11-16 /pmc/articles/PMC9805570/ /pubmed/36383167 http://dx.doi.org/10.1093/bioinformatics/btac736 Text en © The Author(s) 2022. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Clustering single-cell multi-omics data with MoClust
topic Original Paper