Probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data
Single-cell sequencing technology enables the simultaneous capture of multiomic data from multiple cells. The captured data can be represented by tensors, i.e. the higher-rank matrices. However, the existing analysis tools often take the data as a collection of two-order matrices, renouncing the cor...
Main Authors: | Wang, Ruo Han; Wang, Jianping; Li, Shuai Cheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450184/ https://www.ncbi.nlm.nih.gov/pubmed/37403780 http://dx.doi.org/10.1093/nar/gkad570 |
_version_ | 1785095142146834432 |
---|---|
author | Wang, Ruo Han Wang, Jianping Li, Shuai Cheng |
author_facet | Wang, Ruo Han Wang, Jianping Li, Shuai Cheng |
author_sort | Wang, Ruo Han |
collection | PubMed |
description | Single-cell sequencing technology enables the simultaneous capture of multiomic data from multiple cells. The captured data can be represented by tensors, i.e. the higher-rank matrices. However, the existing analysis tools often take the data as a collection of two-order matrices, renouncing the correspondences among the features. Consequently, we propose a probabilistic tensor decomposition framework, SCOIT, to extract embeddings from single-cell multiomic data. SCOIT incorporates various distributions, including Gaussian, Poisson, and negative binomial distributions, to deal with sparse, noisy, and heterogeneous single-cell data. Our framework can decompose a multiomic tensor into a cell embedding matrix, a gene embedding matrix, and an omic embedding matrix, allowing for various downstream analyses. We applied SCOIT to eight single-cell multiomic datasets from different sequencing protocols. With cell embeddings, SCOIT achieves superior performance for cell clustering compared to nine state-of-the-art tools under various metrics, demonstrating its ability to dissect cellular heterogeneity. With the gene embeddings, SCOIT enables cross-omics gene expression analysis and integrative gene regulatory network study. Furthermore, the embeddings allow cross-omics imputation simultaneously, outperforming current imputation methods with the Pearson correlation coefficient increased by 3.38–39.26%; moreover, SCOIT accommodates the scenario that subsets of the cells are with merely one omic profile available. |
format | Online Article Text |
id | pubmed-10450184 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10450184 2023-08-26 Probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data Wang, Ruo Han Wang, Jianping Li, Shuai Cheng Nucleic Acids Res Methods Online Single-cell sequencing technology enables the simultaneous capture of multiomic data from multiple cells. The captured data can be represented by tensors, i.e. the higher-rank matrices. However, the existing analysis tools often take the data as a collection of two-order matrices, renouncing the correspondences among the features. Consequently, we propose a probabilistic tensor decomposition framework, SCOIT, to extract embeddings from single-cell multiomic data. SCOIT incorporates various distributions, including Gaussian, Poisson, and negative binomial distributions, to deal with sparse, noisy, and heterogeneous single-cell data. Our framework can decompose a multiomic tensor into a cell embedding matrix, a gene embedding matrix, and an omic embedding matrix, allowing for various downstream analyses. We applied SCOIT to eight single-cell multiomic datasets from different sequencing protocols. With cell embeddings, SCOIT achieves superior performance for cell clustering compared to nine state-of-the-art tools under various metrics, demonstrating its ability to dissect cellular heterogeneity. With the gene embeddings, SCOIT enables cross-omics gene expression analysis and integrative gene regulatory network study. Furthermore, the embeddings allow cross-omics imputation simultaneously, outperforming current imputation methods with the Pearson correlation coefficient increased by 3.38–39.26%; moreover, SCOIT accommodates the scenario that subsets of the cells are with merely one omic profile available. Oxford University Press 2023-07-05 /pmc/articles/PMC10450184/ /pubmed/37403780 http://dx.doi.org/10.1093/nar/gkad570 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Wang, Ruo Han Wang, Jianping Li, Shuai Cheng Probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data |
title | Probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data |
title_sort | probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450184/ https://www.ncbi.nlm.nih.gov/pubmed/37403780 http://dx.doi.org/10.1093/nar/gkad570 |
work_keys_str_mv | AT wangruohan probabilistictensordecompositionextractsbetterlatentembeddingsfromsinglecellmultiomicdata AT wangjianping probabilistictensordecompositionextractsbetterlatentembeddingsfromsinglecellmultiomicdata AT lishuaicheng probabilistictensordecompositionextractsbetterlatentembeddingsfromsinglecellmultiomicdata |