Jointly Embedding Multiple Single-Cell Omics Measurements

Many single-cell sequencing technologies are now available, but it is still difficult to apply multiple sequencing technologies to the same single cell. In this paper, we propose an unsupervised manifold alignment algorithm, MMD-MA, for integrating multiple measurements carried out on disjoint aliqu...

Bibliographic Details
Main Authors: Liu, Jie, Huang, Yuanhao, Singh, Ritambhara, Vert, Jean-Philippe, Noble, William Stafford
Format: Online Article Text
Language: English
Published: 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496402/
https://www.ncbi.nlm.nih.gov/pubmed/34632462
http://dx.doi.org/10.4230/LIPIcs.WABI.2019.10
_version_ 1784579750911541248
author Liu, Jie
Huang, Yuanhao
Singh, Ritambhara
Vert, Jean-Philippe
Noble, William Stafford
author_facet Liu, Jie
Huang, Yuanhao
Singh, Ritambhara
Vert, Jean-Philippe
Noble, William Stafford
author_sort Liu, Jie
collection PubMed
description Many single-cell sequencing technologies are now available, but it is still difficult to apply multiple sequencing technologies to the same single cell. In this paper, we propose an unsupervised manifold alignment algorithm, MMD-MA, for integrating multiple measurements carried out on disjoint aliquots of a given population of cells. Effectively, MMD-MA performs an in silico co-assay by embedding cells measured in different ways into a learned latent space. In the MMD-MA algorithm, single-cell data points from multiple domains are aligned by optimizing an objective function with three components: (1) a maximum mean discrepancy (MMD) term to encourage the differently measured points to have similar distributions in the latent space, (2) a distortion term to preserve the structure of the data between the input space and the latent space, and (3) a penalty term to avoid collapse to a trivial solution. Notably, MMD-MA does not require any correspondence information across data modalities, either between the cells or between the features. Furthermore, MMD-MA’s weak distributional requirements for the domains to be aligned allow the algorithm to integrate heterogeneous types of single cell measures, such as gene expression, DNA accessibility, chromatin organization, methylation, and imaging data. We demonstrate the utility of MMD-MA in simulation experiments and using a real data set involving single-cell gene expression and methylation data.
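The three-part objective named in the description can be illustrated with a short NumPy sketch. This is only one plausible reading of the abstract, not the authors' implementation: it assumes each domain is given as a kernel (similarity) matrix `k`, that the embedding is `k @ alpha`, that the MMD term uses a Gaussian kernel, and that the distortion and penalty terms are squared Frobenius norms. The names `lam_pen`, `lam_dis`, and `sigma` are hypothetical parameters introduced here for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel between the rows of a and the rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def mmd_squared(x, y, sigma=1.0):
    """Biased estimate of the squared maximum mean discrepancy (term 1)."""
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean())


def distortion(k, alpha):
    """Assumed form of term 2: how far the embedding's inner products
    drift from the input similarities k."""
    z = k @ alpha
    return np.linalg.norm(k - z @ z.T) ** 2


def penalty(k, alpha):
    """Assumed form of term 3: pushes alpha^T k alpha toward the identity,
    which rules out the trivial all-zero embedding."""
    p = alpha.shape[1]
    return np.linalg.norm(alpha.T @ k @ alpha - np.eye(p)) ** 2


def objective(k1, alpha1, k2, alpha2, lam_pen=1e-2, lam_dis=1e-4, sigma=1.0):
    """Weighted sum of the three terms for two domains.
    lam_pen, lam_dis and sigma are hypothetical hyperparameters."""
    z1, z2 = k1 @ alpha1, k2 @ alpha2
    return (mmd_squared(z1, z2, sigma)
            + lam_pen * (penalty(k1, alpha1) + penalty(k2, alpha2))
            + lam_dis * (distortion(k1, alpha1) + distortion(k2, alpha2)))
```

Under these assumptions, `alpha1` and `alpha2` would be found by minimizing `objective` with a gradient-based optimizer. No cell-to-cell or feature-to-feature correspondence enters the computation, which matches the unsupervised setting the description emphasizes.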
format Online
Article
Text
id pubmed-8496402
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-8496402 2021-10-07 Jointly Embedding Multiple Single-Cell Omics Measurements Liu, Jie Huang, Yuanhao Singh, Ritambhara Vert, Jean-Philippe Noble, William Stafford Algorithms Bioinform Article 2019-09-03 /pmc/articles/PMC8496402/ /pubmed/34632462 http://dx.doi.org/10.4230/LIPIcs.WABI.2019.10 Text en https://creativecommons.org/licenses/by/4.0/ Licensed under Creative Commons License CC-BY. 19th International Workshop on Algorithms in Bioinformatics (WABI 2019).
spellingShingle Article
Liu, Jie
Huang, Yuanhao
Singh, Ritambhara
Vert, Jean-Philippe
Noble, William Stafford
Jointly Embedding Multiple Single-Cell Omics Measurements
title Jointly Embedding Multiple Single-Cell Omics Measurements
title_full Jointly Embedding Multiple Single-Cell Omics Measurements
title_fullStr Jointly Embedding Multiple Single-Cell Omics Measurements
title_full_unstemmed Jointly Embedding Multiple Single-Cell Omics Measurements
title_short Jointly Embedding Multiple Single-Cell Omics Measurements
title_sort jointly embedding multiple single-cell omics measurements
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496402/
https://www.ncbi.nlm.nih.gov/pubmed/34632462
http://dx.doi.org/10.4230/LIPIcs.WABI.2019.10