
The performance of deep generative models for learning joint embeddings of single-cell multi-omics data

Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approac...


Bibliographic Details
Main authors: Brombacher, Eva; Hackenberg, Maren; Kreutz, Clemens; Binder, Harald; Treppner, Martin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643784/
https://www.ncbi.nlm.nih.gov/pubmed/36387277
http://dx.doi.org/10.3389/fmolb.2022.962644
author Brombacher, Eva
Hackenberg, Maren
Kreutz, Clemens
Binder, Harald
Treppner, Martin
collection PubMed
description Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the existing literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider eight popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics types currently favored. Specifically, we use data featuring simultaneous gene expression measurements at the RNA level and protein abundance measurements for cell surface proteins (CITE-seq), as well as data where chromatin accessibility and RNA expression are measured in thousands of cells (10x Multiome). We examine the ability of the methods to learn joint embeddings based on biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments.
format Online
Article
Text
id pubmed-9643784
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9643784 2022-11-15 The performance of deep generative models for learning joint embeddings of single-cell multi-omics data. Brombacher, Eva; Hackenberg, Maren; Kreutz, Clemens; Binder, Harald; Treppner, Martin. Front Mol Biosci (Molecular Biosciences). Frontiers Media S.A. 2022-10-26 /pmc/articles/PMC9643784/ /pubmed/36387277 http://dx.doi.org/10.3389/fmolb.2022.962644 Text en Copyright © 2022 Brombacher, Hackenberg, Kreutz, Binder and Treppner.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title The performance of deep generative models for learning joint embeddings of single-cell multi-omics data
topic Molecular Biosciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643784/
https://www.ncbi.nlm.nih.gov/pubmed/36387277
http://dx.doi.org/10.3389/fmolb.2022.962644