The performance of deep generative models for learning joint embeddings of single-cell multi-omics data

Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approac...

Bibliographic Details
Main authors:	Brombacher, Eva; Hackenberg, Maren; Kreutz, Clemens; Binder, Harald; Treppner, Martin
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Molecular Biosciences
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643784/ https://www.ncbi.nlm.nih.gov/pubmed/36387277 http://dx.doi.org/10.3389/fmolb.2022.962644

_version_	1784826594418753536
author	Brombacher, Eva; Hackenberg, Maren; Kreutz, Clemens; Binder, Harald; Treppner, Martin
author_facet	Brombacher, Eva; Hackenberg, Maren; Kreutz, Clemens; Binder, Harald; Treppner, Martin
author_sort	Brombacher, Eva
collection	PubMed
description	Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the existing literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider eight popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics types currently favored. Specifically, we use data featuring simultaneous gene expression measurements at the RNA level and protein abundance measurements for cell surface proteins (CITE-seq), as well as data where chromatin accessibility and RNA expression are measured in thousands of cells (10x Multiome). We examine the ability of the methods to learn joint embeddings based on biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments.
format	Online Article Text
id	pubmed-9643784
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-96437842022-11-15 The performance of deep generative models for learning joint embeddings of single-cell multi-omics data Brombacher, Eva Hackenberg, Maren Kreutz, Clemens Binder, Harald Treppner, Martin Front Mol Biosci Molecular Biosciences Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the existing literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider eight popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics types currently favored. Specifically, we use data featuring simultaneous gene expression measurements at the RNA level and protein abundance measurements for cell surface proteins (CITE-seq), as well as data where chromatin accessibility and RNA expression are measured in thousands of cells (10x Multiome). We examine the ability of the methods to learn joint embeddings based on biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments. Frontiers Media S.A. 2022-10-26 /pmc/articles/PMC9643784/ /pubmed/36387277 http://dx.doi.org/10.3389/fmolb.2022.962644 Text en Copyright © 2022 Brombacher, Hackenberg, Kreutz, Binder and Treppner. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Molecular Biosciences Brombacher, Eva Hackenberg, Maren Kreutz, Clemens Binder, Harald Treppner, Martin The performance of deep generative models for learning joint embeddings of single-cell multi-omics data
title	The performance of deep generative models for learning joint embeddings of single-cell multi-omics data
title_sort	performance of deep generative models for learning joint embeddings of single-cell multi-omics data
topic	Molecular Biosciences
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9643784/ https://www.ncbi.nlm.nih.gov/pubmed/36387277 http://dx.doi.org/10.3389/fmolb.2022.962644
