Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space
Computational tools for integrative analyses of diverse single-cell experiments are facing formidable new challenges including dramatic increases in data scale, sample heterogeneity, and the need to informatively cross-reference new data with foundational datasets. Here, we present SCALEX, a deep-le...
Main authors: | Xiong, Lei, Tian, Kang, Li, Yuzhe, Ning, Weixi, Gao, Xin, Zhang, Qiangfeng Cliff |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9574176/ https://www.ncbi.nlm.nih.gov/pubmed/36253379 http://dx.doi.org/10.1038/s41467-022-33758-z |
_version_ | 1784811048232026112 |
---|---|
author | Xiong, Lei Tian, Kang Li, Yuzhe Ning, Weixi Gao, Xin Zhang, Qiangfeng Cliff |
author_facet | Xiong, Lei Tian, Kang Li, Yuzhe Ning, Weixi Gao, Xin Zhang, Qiangfeng Cliff |
author_sort | Xiong, Lei |
collection | PubMed |
description | Computational tools for integrative analyses of diverse single-cell experiments are facing formidable new challenges including dramatic increases in data scale, sample heterogeneity, and the need to informatively cross-reference new data with foundational datasets. Here, we present SCALEX, a deep-learning method that integrates single-cell data by projecting cells into a batch-invariant, common cell-embedding space in a truly online manner (i.e., without retraining the model). SCALEX substantially outperforms online iNMF and other state-of-the-art non-online integration methods on benchmark single-cell datasets of diverse modalities (e.g., single-cell RNA sequencing, scRNA-seq; single-cell assay for transposase-accessible chromatin using sequencing, scATAC-seq), especially for datasets with partial overlaps, accurately aligning similar cell populations while retaining true biological differences. We showcase SCALEX’s advantages by constructing continuously expandable single-cell atlases for human, mouse, and COVID-19 patients, each assembled from diverse data sources and growing with every new dataset. The online data integration capacity and superior performance make SCALEX particularly appropriate for large-scale single-cell applications to build upon previous scientific insights. |
format | Online Article Text |
id | pubmed-9574176 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9574176 2022-10-17 Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space Xiong, Lei Tian, Kang Li, Yuzhe Ning, Weixi Gao, Xin Zhang, Qiangfeng Cliff Nat Commun Article Computational tools for integrative analyses of diverse single-cell experiments are facing formidable new challenges including dramatic increases in data scale, sample heterogeneity, and the need to informatively cross-reference new data with foundational datasets. Here, we present SCALEX, a deep-learning method that integrates single-cell data by projecting cells into a batch-invariant, common cell-embedding space in a truly online manner (i.e., without retraining the model). SCALEX substantially outperforms online iNMF and other state-of-the-art non-online integration methods on benchmark single-cell datasets of diverse modalities (e.g., single-cell RNA sequencing, scRNA-seq; single-cell assay for transposase-accessible chromatin using sequencing, scATAC-seq), especially for datasets with partial overlaps, accurately aligning similar cell populations while retaining true biological differences. We showcase SCALEX’s advantages by constructing continuously expandable single-cell atlases for human, mouse, and COVID-19 patients, each assembled from diverse data sources and growing with every new dataset. The online data integration capacity and superior performance make SCALEX particularly appropriate for large-scale single-cell applications to build upon previous scientific insights.
Nature Publishing Group UK 2022-10-17 /pmc/articles/PMC9574176/ /pubmed/36253379 http://dx.doi.org/10.1038/s41467-022-33758-z Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Xiong, Lei Tian, Kang Li, Yuzhe Ning, Weixi Gao, Xin Zhang, Qiangfeng Cliff Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space |
title | Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space |
title_full | Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space |
title_fullStr | Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space |
title_full_unstemmed | Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space |
title_short | Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space |
title_sort | online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9574176/ https://www.ncbi.nlm.nih.gov/pubmed/36253379 http://dx.doi.org/10.1038/s41467-022-33758-z |
work_keys_str_mv | AT xionglei onlinesinglecelldataintegrationthroughprojectingheterogeneousdatasetsintoacommoncellembeddingspace AT tiankang onlinesinglecelldataintegrationthroughprojectingheterogeneousdatasetsintoacommoncellembeddingspace AT liyuzhe onlinesinglecelldataintegrationthroughprojectingheterogeneousdatasetsintoacommoncellembeddingspace AT ningweixi onlinesinglecelldataintegrationthroughprojectingheterogeneousdatasetsintoacommoncellembeddingspace AT gaoxin onlinesinglecelldataintegrationthroughprojectingheterogeneousdatasetsintoacommoncellembeddingspace AT zhangqiangfengcliff onlinesinglecelldataintegrationthroughprojectingheterogeneousdatasetsintoacommoncellembeddingspace |