The performance of deep generative models for learning joint embeddings of single-cell multi-omics data
Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approac...
Main Authors: | Brombacher, Eva; Hackenberg, Maren; Kreutz, Clemens; Binder, Harald; Treppner, Martin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Molecular Biosciences |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643784/ https://www.ncbi.nlm.nih.gov/pubmed/36387277 http://dx.doi.org/10.3389/fmolb.2022.962644 |
_version_ | 1784826594418753536 |
---|---|
author | Brombacher, Eva Hackenberg, Maren Kreutz, Clemens Binder, Harald Treppner, Martin |
author_facet | Brombacher, Eva Hackenberg, Maren Kreutz, Clemens Binder, Harald Treppner, Martin |
author_sort | Brombacher, Eva |
collection | PubMed |
description | Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the existing literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider eight popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics types currently favored. Specifically, we use data featuring simultaneous gene expression measurements at the RNA level and protein abundance measurements for cell surface proteins (CITE-seq), as well as data where chromatin accessibility and RNA expression are measured in thousands of cells (10x Multiome). We examine the ability of the methods to learn joint embeddings based on biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments. |
format | Online Article Text |
id | pubmed-9643784 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9643784 2022-11-15 The performance of deep generative models for learning joint embeddings of single-cell multi-omics data Brombacher, Eva Hackenberg, Maren Kreutz, Clemens Binder, Harald Treppner, Martin Front Mol Biosci Molecular Biosciences Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the existing literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider eight popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics types currently favored. Specifically, we use data featuring simultaneous gene expression measurements at the RNA level and protein abundance measurements for cell surface proteins (CITE-seq), as well as data where chromatin accessibility and RNA expression are measured in thousands of cells (10x Multiome). We examine the ability of the methods to learn joint embeddings based on biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments. Frontiers Media S.A. 2022-10-26 /pmc/articles/PMC9643784/ /pubmed/36387277 http://dx.doi.org/10.3389/fmolb.2022.962644 Text en Copyright © 2022 Brombacher, Hackenberg, Kreutz, Binder and Treppner.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Molecular Biosciences Brombacher, Eva Hackenberg, Maren Kreutz, Clemens Binder, Harald Treppner, Martin The performance of deep generative models for learning joint embeddings of single-cell multi-omics data |
title | The performance of deep generative models for learning joint embeddings of single-cell multi-omics data |
topic | Molecular Biosciences |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643784/ https://www.ncbi.nlm.nih.gov/pubmed/36387277 http://dx.doi.org/10.3389/fmolb.2022.962644 |