
An in-depth comparison of linear and non-linear joint embedding methods for bulk and single-cell multi-omics

Multi-omic analyses are necessary to understand the complex biological processes taking place at the tissue and cell level, but also to make reliable predictions about, for example, disease outcome. Several linear methods exist that create a joint embedding using paired information per sample, but r...


Bibliographic Details
Main authors: Makrodimitris, Stavros; Pronk, Bram; Abdelaal, Tamim; Reinders, Marcel
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10685331/
https://www.ncbi.nlm.nih.gov/pubmed/38018908
http://dx.doi.org/10.1093/bib/bbad416
author Makrodimitris, Stavros
Pronk, Bram
Abdelaal, Tamim
Reinders, Marcel
collection PubMed
description Multi-omic analyses are necessary to understand the complex biological processes taking place at the tissue and cell level, but also to make reliable predictions about, for example, disease outcome. Several linear methods exist that create a joint embedding using paired information per sample, but recently there has been a rise in the popularity of neural architectures that embed paired -omics into the same non-linear manifold. This work describes a head-to-head comparison of linear and non-linear joint embedding methods using both bulk and single-cell multi-modal datasets. We found that non-linear methods have a clear advantage with respect to linear ones for missing modality imputation. Performance comparisons in the downstream tasks of survival analysis for bulk tumor data and cell type classification for single-cell data lead to the following insights: First, concatenating the principal components of each modality is a competitive baseline and hard to beat if all modalities are available at test time. However, if we only have one modality available at test time, training a predictive model on the joint space of that modality can lead to performance improvements with respect to just using the unimodal principal components. Second, -omic profiles imputed by neural joint embedding methods are realistic enough to be used by a classifier trained on real data with limited performance drops. Taken together, our comparisons give hints to which joint embedding to use for which downstream task. Overall, product-of-experts performed well in most tasks and was reasonably fast, while early integration (concatenation) of modalities did quite poorly.
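As an illustration of the product-of-experts idea named in the description, here is a minimal sketch of how per-modality Gaussian posteriors are commonly fused in multimodal variational autoencoders. This record does not specify the formulation used in the article, so the function name `product_of_gaussian_experts`, the optional standard-normal prior expert, and the one-dimensional setting are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def product_of_gaussian_experts(
    means: Sequence[float],
    variances: Sequence[float],
    include_prior: bool = True,
) -> Tuple[float, float]:
    """Fuse 1-D Gaussian experts N(mean_i, var_i) into a single Gaussian.

    Hypothetical helper, not taken from the article. The product of
    Gaussian densities is (up to normalisation) a Gaussian whose precision
    is the sum of the expert precisions and whose mean is the
    precision-weighted average of the expert means.
    """
    if len(means) != len(variances):
        raise ValueError("means and variances must have the same length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    precisions = [1.0 / v for v in variances]
    weighted_means = [m / v for m, v in zip(means, variances)]

    if include_prior:
        # Assumed prior expert N(0, 1): precision 1, weighted mean 0.
        precisions.append(1.0)
        weighted_means.append(0.0)

    total_precision = sum(precisions)
    joint_mean = sum(weighted_means) / total_precision
    joint_variance = 1.0 / total_precision
    return joint_mean, joint_variance


# Two modalities observed, then only one (the other is simply left out).
both = product_of_gaussian_experts([2.0, 4.0], [1.0, 1.0])
single = product_of_gaussian_experts([2.0], [1.0])
```

Because a missing modality is handled by dropping its expert from the product, the same fused latent space can be used when only one modality is available at test time, which is the setting the description highlights.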
format Online
Article
Text
id pubmed-10685331
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10685331 2023-11-30 Brief Bioinform Review
Oxford University Press 2023-11-28 /pmc/articles/PMC10685331/ /pubmed/38018908 http://dx.doi.org/10.1093/bib/bbad416 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title An in-depth comparison of linear and non-linear joint embedding methods for bulk and single-cell multi-omics
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10685331/
https://www.ncbi.nlm.nih.gov/pubmed/38018908
http://dx.doi.org/10.1093/bib/bbad416