Integrated analysis of multimodal single-cell data with structural similarity
Multimodal single-cell sequencing technologies provide unprecedented information on cellular heterogeneity from multiple layers of genomic readouts. However, joint analysis of two modalities without properly handling the noise often leads to overfitting of one modality by the other and worse cluster...
Main authors: | Cao, Yingxin; Fu, Laiyi; Wu, Jie; Peng, Qinke; Nie, Qing; Zhang, Jing; Xie, Xiaohui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9757079/ https://www.ncbi.nlm.nih.gov/pubmed/36130281 http://dx.doi.org/10.1093/nar/gkac781 |
_version_ | 1784851754663280640 |
---|---|
author | Cao, Yingxin; Fu, Laiyi; Wu, Jie; Peng, Qinke; Nie, Qing; Zhang, Jing; Xie, Xiaohui |
author_sort | Cao, Yingxin |
collection | PubMed |
description | Multimodal single-cell sequencing technologies provide unprecedented information on cellular heterogeneity from multiple layers of genomic readouts. However, joint analysis of two modalities without properly handling the noise often leads to overfitting of one modality by the other and worse clustering results than vanilla single-modality analysis. How to efficiently use the extra information from single-cell multi-omics to delineate cell states and identify meaningful signal remains a significant computational challenge. In this work, we propose a deep learning framework, named SAILERX, for efficient, robust, and flexible analysis of multimodal single-cell data. SAILERX consists of a variational autoencoder with invariant representation learning to correct technical noise from the sequencing process, and a multimodal data alignment mechanism to integrate information from different modalities. Instead of performing hard alignment by projecting both modalities to a shared latent space, SAILERX encourages the local structures of the two modalities, measured by pairwise similarities, to be similar. This strategy is more robust against overfitting to noise, which facilitates various downstream analyses such as clustering, imputation, and marker gene detection. Furthermore, the invariant representation learning part enables SAILERX to perform integrative analysis on both multi- and single-modal datasets, making it an applicable and scalable tool for more general scenarios. |
format | Online Article Text |
id | pubmed-9757079 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-97570792022-12-19 Integrated analysis of multimodal single-cell data with structural similarity Cao, Yingxin Fu, Laiyi Wu, Jie Peng, Qinke Nie, Qing Zhang, Jing Xie, Xiaohui Nucleic Acids Res Methods Online Multimodal single-cell sequencing technologies provide unprecedented information on cellular heterogeneity from multiple layers of genomic readouts. However, joint analysis of two modalities without properly handling the noise often leads to overfitting of one modality by the other and worse clustering results than vanilla single-modality analysis. How to efficiently utilize the extra information from single cell multi-omics to delineate cell states and identify meaningful signal remains as a significant computational challenge. In this work, we propose a deep learning framework, named SAILERX, for efficient, robust, and flexible analysis of multi-modal single-cell data. SAILERX consists of a variational autoencoder with invariant representation learning to correct technical noises from sequencing process, and a multimodal data alignment mechanism to integrate information from different modalities. Instead of performing hard alignment by projecting both modalities to a shared latent space, SAILERX encourages the local structures of two modalities measured by pairwise similarities to be similar. This strategy is more robust against overfitting of noises, which facilitates various downstream analysis such as clustering, imputation, and marker gene detection. Furthermore, the invariant representation learning part enables SAILERX to perform integrative analysis on both multi- and single-modal datasets, making it an applicable and scalable tool for more general scenarios. Oxford University Press 2022-09-21 /pmc/articles/PMC9757079/ /pubmed/36130281 http://dx.doi.org/10.1093/nar/gkac781 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Integrated analysis of multimodal single-cell data with structural similarity |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9757079/ https://www.ncbi.nlm.nih.gov/pubmed/36130281 http://dx.doi.org/10.1093/nar/gkac781 |
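
The description above says that SAILERX encourages the local structures of two modalities, measured by pairwise similarities, to be similar rather than projecting both into one shared latent space. The sketch below illustrates that general idea only. It is not the SAILERX implementation: the choice of cosine similarity, the mean-squared penalty, and the function names `cosine_similarity_matrix` and `structural_alignment_loss` are all assumptions made for illustration.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Return the n x n matrix of pairwise cosine similarities between rows."""
    unit_rows = []
    for row in embeddings:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0  # guard against zero vectors
        unit_rows.append([v / norm for v in row])
    return [[sum(a * b for a, b in zip(u, v)) for v in unit_rows] for u in unit_rows]


def structural_alignment_loss(latent_a, latent_b):
    """Mean squared difference between two pairwise-similarity matrices.

    Hypothetical penalty (an assumption, not the published loss): only the
    cell-to-cell similarity structure is compared, so the two embeddings may
    live in latent spaces of different dimension.
    """
    if len(latent_a) != len(latent_b):
        raise ValueError("both modalities must describe the same cells")
    sim_a = cosine_similarity_matrix(latent_a)
    sim_b = cosine_similarity_matrix(latent_b)
    n = len(sim_a)
    total = sum((sim_a[i][j] - sim_b[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


if __name__ == "__main__":
    # Toy embeddings of three cells in two modalities (made-up numbers).
    rna_latent = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    # Same cells rotated by 90 degrees: coordinates differ, structure is unchanged.
    atac_latent = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]
    print(structural_alignment_loss(rna_latent, atac_latent))
```

In this toy case the two embeddings have different coordinates but identical pairwise similarities, so the penalty is essentially zero. A hard alignment that compared coordinates directly would instead report a large mismatch, which is the contrast the description draws.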