Clustering CITE-seq data with a canonical correlation-based deep learning method

Single-cell multiomics sequencing techniques have rapidly developed in the past few years. Among these techniques, single-cell cellular indexing of transcriptomes and epitopes (CITE-seq) allows simultaneous quantification of gene expression and surface proteins. Clustering CITE-seq data have the gre...

Bibliographic Details
Main authors: Yuan, Musu, Chen, Liang, Deng, Minghua
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9441595/
https://www.ncbi.nlm.nih.gov/pubmed/36072672
http://dx.doi.org/10.3389/fgene.2022.977968
_version_ 1784782614056402944
author Yuan, Musu
Chen, Liang
Deng, Minghua
collection PubMed
description Single-cell multiomics sequencing techniques have developed rapidly in the past few years. Among these techniques, single-cell cellular indexing of transcriptomes and epitopes (CITE-seq) allows simultaneous quantification of gene expression and surface proteins. Clustering CITE-seq data has great potential to provide a more comprehensive and in-depth view of cell states and interactions. However, CITE-seq data inherit the properties of scRNA-seq data, being noisy, high-dimensional, and highly sparse. Moreover, the representations of RNA and surface protein sometimes have low correlation and contribute differently to the clustering objective. To overcome these obstacles and find a combined representation well suited for clustering, we proposed scCTClust for clustering analysis of multiomics data, especially CITE-seq data. Two omics-specific neural networks are introduced to extract cluster information from omics data. A deep canonical correlation method is adopted to find the maximally correlated representations of the two omics. A novel decentralized clustering method is applied to the linear combination of the latent representations of the two omics. The fusion weights, which account for the contribution of each omics to clustering, are adaptively updated during training. Extensive experiments on both simulated and real CITE-seq data sets demonstrated the power of scCTClust. We also applied scCTClust to transcriptome–epigenome data to illustrate its potential for generalization.
format Online
Article
Text
id pubmed-9441595
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9441595 2022-09-06 Clustering CITE-seq data with a canonical correlation-based deep learning method Yuan, Musu Chen, Liang Deng, Minghua Front Genet Genetics Single-cell multiomics sequencing techniques have developed rapidly in the past few years. Among these techniques, single-cell cellular indexing of transcriptomes and epitopes (CITE-seq) allows simultaneous quantification of gene expression and surface proteins. Clustering CITE-seq data has great potential to provide a more comprehensive and in-depth view of cell states and interactions. However, CITE-seq data inherit the properties of scRNA-seq data, being noisy, high-dimensional, and highly sparse. Moreover, the representations of RNA and surface protein sometimes have low correlation and contribute differently to the clustering objective. To overcome these obstacles and find a combined representation well suited for clustering, we proposed scCTClust for clustering analysis of multiomics data, especially CITE-seq data. Two omics-specific neural networks are introduced to extract cluster information from omics data. A deep canonical correlation method is adopted to find the maximally correlated representations of the two omics. A novel decentralized clustering method is applied to the linear combination of the latent representations of the two omics. The fusion weights, which account for the contribution of each omics to clustering, are adaptively updated during training. Extensive experiments on both simulated and real CITE-seq data sets demonstrated the power of scCTClust. We also applied scCTClust to transcriptome–epigenome data to illustrate its potential for generalization. Frontiers Media S.A. 2022-08-22 /pmc/articles/PMC9441595/ /pubmed/36072672 http://dx.doi.org/10.3389/fgene.2022.977968 Text en Copyright © 2022 Yuan, Chen and Deng. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Clustering CITE-seq data with a canonical correlation-based deep learning method
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9441595/
https://www.ncbi.nlm.nih.gov/pubmed/36072672
http://dx.doi.org/10.3389/fgene.2022.977968
work_keys_str_mv AT yuanmusu clusteringciteseqdatawithacanonicalcorrelationbaseddeeplearningmethod
AT chenliang clusteringciteseqdatawithacanonicalcorrelationbaseddeeplearningmethod
AT dengminghua clusteringciteseqdatawithacanonicalcorrelationbaseddeeplearningmethod
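
The description field above mentions two ideas that a small numerical sketch can make concrete: finding maximally correlated representations of two omics, and clustering on a weighted linear combination of their latent representations. The NumPy example below is only an illustration under stated assumptions, not the authors' scCTClust implementation. It computes classical (linear) canonical correlations between two fixed latent matrices, where the record describes a deep variant trained through neural networks, and it fuses them with a fixed weight, where the record describes weights updated adaptively during training. All names (`inv_sqrt`, `canonical_correlations`, `fuse`, `h_rna`, `h_protein`) and the simulated data are hypothetical.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def canonical_correlations(h1, h2, reg=1e-4):
    """Canonical correlations between two (cells x features) matrices.

    They are the singular values of S11^{-1/2} S12 S22^{-1/2}, where S11 and
    S22 are the (ridge-regularized) covariances and S12 the cross-covariance.
    """
    n = h1.shape[0]
    h1c = h1 - h1.mean(axis=0)
    h2c = h2 - h2.mean(axis=0)
    s11 = h1c.T @ h1c / (n - 1) + reg * np.eye(h1.shape[1])
    s22 = h2c.T @ h2c / (n - 1) + reg * np.eye(h2.shape[1])
    s12 = h1c.T @ h2c / (n - 1)
    t = inv_sqrt(s11) @ s12 @ inv_sqrt(s22)
    return np.linalg.svd(t, compute_uv=False)


def fuse(h1, h2, w):
    """Linear combination of two latent matrices; w is the weight on h1.

    Assumption: w is fixed here, whereas the record describes fusion weights
    that are learned during training.
    """
    return w * h1 + (1.0 - w) * h2


# Simulated latent representations that share a common signal z (hypothetical data).
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 4))
h_rna = z + 0.1 * rng.normal(size=(500, 4))
h_protein = 2.0 * z + 0.1 * rng.normal(size=(500, 4))

corr = canonical_correlations(h_rna, h_protein)
fused = fuse(h_rna, h_protein, w=0.7)

# Two unrelated matrices give much smaller canonical correlations.
corr_noise = canonical_correlations(
    rng.normal(size=(500, 4)), rng.normal(size=(500, 4))
)
print("shared signal:", np.round(corr, 3))
print("independent:  ", np.round(corr_noise, 3))
```

A deep canonical correlation objective typically maximizes the sum of such correlations with respect to the network parameters that produce the two latent matrices; the fused matrix would then be the input to the clustering step.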