Integrated analysis of multimodal single-cell data with structural similarity

Bibliographic Details
Main Authors: Cao, Yingxin; Fu, Laiyi; Wu, Jie; Peng, Qinke; Nie, Qing; Zhang, Jing; Xie, Xiaohui
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Methods Online
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9757079/
https://www.ncbi.nlm.nih.gov/pubmed/36130281
http://dx.doi.org/10.1093/nar/gkac781
author Cao, Yingxin
Fu, Laiyi
Wu, Jie
Peng, Qinke
Nie, Qing
Zhang, Jing
Xie, Xiaohui
collection PubMed
description Multimodal single-cell sequencing technologies provide unprecedented information on cellular heterogeneity from multiple layers of genomic readouts. However, joint analysis of two modalities without properly handling the noise often leads to overfitting of one modality by the other and worse clustering results than vanilla single-modality analysis. How to efficiently utilize the extra information from single-cell multi-omics to delineate cell states and identify meaningful signal remains a significant computational challenge. In this work, we propose a deep learning framework, named SAILERX, for efficient, robust, and flexible analysis of multimodal single-cell data. SAILERX consists of a variational autoencoder with invariant representation learning to correct technical noise from the sequencing process, and a multimodal data alignment mechanism to integrate information from different modalities. Instead of performing hard alignment by projecting both modalities to a shared latent space, SAILERX encourages the local structures of the two modalities, measured by pairwise similarities, to be similar. This strategy is more robust against overfitting to noise, which facilitates various downstream analyses such as clustering, imputation, and marker gene detection. Furthermore, the invariant representation learning part enables SAILERX to perform integrative analysis on both multi- and single-modal datasets, making it an applicable and scalable tool for more general scenarios.
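The idea of aligning local structure rather than coordinates can be illustrated with a minimal sketch. This is not the SAILERX implementation: the function names are hypothetical, and the choice of cosine similarity with a mean squared difference is an assumption made for illustration, since the abstract only says that pairwise similarities of the two modalities are encouraged to be similar.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarities between cells (rows) of one modality."""
    # Guard against zero-length vectors to avoid division by zero.
    norms = [max(math.sqrt(sum(x * x for x in v)), 1e-12) for v in embeddings]
    n = len(embeddings)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            sim[i][j] = dot / (norms[i] * norms[j])
    return sim


def structural_alignment_loss(emb_a, emb_b):
    """Mean squared difference between the two cell-by-cell similarity matrices.

    Assumed form of a "soft" alignment penalty: the two embeddings may live in
    different latent spaces as long as the same cells are similar in both.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("both modalities must describe the same cells")
    sim_a = cosine_similarity_matrix(emb_a)
    sim_b = cosine_similarity_matrix(emb_b)
    n = len(sim_a)
    total = sum(
        (sim_a[i][j] - sim_b[i][j]) ** 2 for i in range(n) for j in range(n)
    )
    return total / (n * n)


# Toy data: three cells embedded from two hypothetical modalities.
emb_rna = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Same geometry, rotated by 90 degrees and scaled by 2: coordinates differ,
# pairwise similarities do not.
emb_atac_rotated = [[0.0, 2.0], [-2.0, 0.0], [-2.0, 2.0]]
# Different neighbourhood structure among the same three cells.
emb_atac_other = [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]]

loss_same_structure = structural_alignment_loss(emb_rna, emb_atac_rotated)
loss_diff_structure = structural_alignment_loss(emb_rna, emb_atac_other)
```

In this toy case the rotated and rescaled embedding incurs essentially no penalty even though no coordinate matches, whereas the embedding with a different neighbourhood structure is penalised. A hard alignment that compared coordinates directly would penalise both.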
format Online
Article
Text
id pubmed-9757079
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9757079 2022-12-19
journal Nucleic Acids Res
publication date 2022-09-21
rights © The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Integrated analysis of multimodal single-cell data with structural similarity
topic Methods Online