Jointly Embedding Multiple Single-Cell Omics Measurements
Many single-cell sequencing technologies are now available, but it is still difficult to apply multiple sequencing technologies to the same single cell. In this paper, we propose an unsupervised manifold alignment algorithm, MMD-MA, for integrating multiple measurements carried out on disjoint aliqu...
Main authors: | Liu, Jie; Huang, Yuanhao; Singh, Ritambhara; Vert, Jean-Philippe; Noble, William Stafford |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496402/ https://www.ncbi.nlm.nih.gov/pubmed/34632462 http://dx.doi.org/10.4230/LIPIcs.WABI.2019.10 |
_version_ | 1784579750911541248 |
---|---|
author | Liu, Jie Huang, Yuanhao Singh, Ritambhara Vert, Jean-Philippe Noble, William Stafford |
author_facet | Liu, Jie Huang, Yuanhao Singh, Ritambhara Vert, Jean-Philippe Noble, William Stafford |
author_sort | Liu, Jie |
collection | PubMed |
description | Many single-cell sequencing technologies are now available, but it is still difficult to apply multiple sequencing technologies to the same single cell. In this paper, we propose an unsupervised manifold alignment algorithm, MMD-MA, for integrating multiple measurements carried out on disjoint aliquots of a given population of cells. Effectively, MMD-MA performs an in silico co-assay by embedding cells measured in different ways into a learned latent space. In the MMD-MA algorithm, single-cell data points from multiple domains are aligned by optimizing an objective function with three components: (1) a maximum mean discrepancy (MMD) term to encourage the differently measured points to have similar distributions in the latent space, (2) a distortion term to preserve the structure of the data between the input space and the latent space, and (3) a penalty term to avoid collapse to a trivial solution. Notably, MMD-MA does not require any correspondence information across data modalities, either between the cells or between the features. Furthermore, MMD-MA’s weak distributional requirements for the domains to be aligned allow the algorithm to integrate heterogeneous types of single cell measures, such as gene expression, DNA accessibility, chromatin organization, methylation, and imaging data. We demonstrate the utility of MMD-MA in simulation experiments and using a real data set involving single-cell gene expression and methylation data. |
format | Online Article Text |
id | pubmed-8496402 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
record_format | MEDLINE/PubMed |
spelling | pubmed-8496402 2021-10-07 Jointly Embedding Multiple Single-Cell Omics Measurements Liu, Jie Huang, Yuanhao Singh, Ritambhara Vert, Jean-Philippe Noble, William Stafford Algorithms Bioinform Article 2019-09-03 /pmc/articles/PMC8496402/ /pubmed/34632462 http://dx.doi.org/10.4230/LIPIcs.WABI.2019.10 Text en https://creativecommons.org/licenses/by/4.0/ Licensed under Creative Commons License CC-BY. 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). |
spellingShingle | Article Liu, Jie Huang, Yuanhao Singh, Ritambhara Vert, Jean-Philippe Noble, William Stafford Jointly Embedding Multiple Single-Cell Omics Measurements |
title | Jointly Embedding Multiple Single-Cell Omics Measurements |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496402/ https://www.ncbi.nlm.nih.gov/pubmed/34632462 http://dx.doi.org/10.4230/LIPIcs.WABI.2019.10 |
work_keys_str_mv | AT liujie jointlyembeddingmultiplesinglecellomicsmeasurements AT huangyuanhao jointlyembeddingmultiplesinglecellomicsmeasurements AT singhritambhara jointlyembeddingmultiplesinglecellomicsmeasurements AT vertjeanphilippe jointlyembeddingmultiplesinglecellomicsmeasurements AT noblewilliamstafford jointlyembeddingmultiplesinglecellomicsmeasurements |
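
The three-part objective named in the description (an MMD term, a distortion term, and a penalty term) can be sketched in NumPy as follows. The record gives no formulas, so everything concrete here is an assumption made for illustration: each domain is represented by a kernel matrix `k`, its latent embedding is `k @ alpha`, the MMD uses a Gaussian kernel with bandwidth `sigma`, the distortion and penalty terms are squared Frobenius norms, and the weights `lam_dis` and `lam_pen` are hypothetical names and values. It is a minimal sketch of one plausible form, not the authors' implementation, and it only evaluates the objective; minimizing it over `alpha1` and `alpha2` is left out.

```python
import numpy as np


def gaussian_gram(a, b, sigma):
    """Gaussian kernel values between every row of a and every row of b."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def mmd_squared(x, y, sigma=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    return (
        gaussian_gram(x, x, sigma).mean()
        + gaussian_gram(y, y, sigma).mean()
        - 2.0 * gaussian_gram(x, y, sigma).mean()
    )


def distortion(k, alpha):
    """Assumed form: how far the embedded kernel is from the input kernel."""
    return np.linalg.norm(k - k @ alpha @ alpha.T @ k) ** 2


def penalty(k, alpha):
    """Assumed form: discourages collapse by pushing alpha^T k alpha toward I."""
    p = alpha.shape[1]
    return np.linalg.norm(alpha.T @ k @ alpha - np.eye(p)) ** 2


def objective(k1, k2, alpha1, alpha2, lam_dis=1e-3, lam_pen=1e-3, sigma=1.0):
    """MMD term plus weighted distortion and penalty terms for two domains."""
    z1 = k1 @ alpha1  # latent coordinates of the cells in domain 1
    z2 = k2 @ alpha2  # latent coordinates of the cells in domain 2
    return (
        mmd_squared(z1, z2, sigma)
        + lam_dis * (distortion(k1, alpha1) + distortion(k2, alpha2))
        + lam_pen * (penalty(k1, alpha1) + penalty(k2, alpha2))
    )
```

Note that `k1` and `k2` may have different sizes, since the two domains contain different cells, and that no cell-to-cell or feature-to-feature correspondence enters the computation; only the latent dimension (the number of columns of `alpha1` and `alpha2`) has to match.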