
Integration of single cell data by disentangled representation learning

Recent developments in single-cell RNA-sequencing technologies have led to exponential growth in single-cell sequencing datasets across different conditions. Combining these datasets helps to better understand cellular identity and function. However, it is challenging to integrate different datasets...


Bibliographic Details
Main authors: Guo, Tiantian, Chen, Yang, Shi, Minglei, Li, Xiangyu, Zhang, Michael Q
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8788944/
https://www.ncbi.nlm.nih.gov/pubmed/34850092
http://dx.doi.org/10.1093/nar/gkab978
author Guo, Tiantian
Chen, Yang
Shi, Minglei
Li, Xiangyu
Zhang, Michael Q
collection PubMed
description Recent developments in single-cell RNA-sequencing technologies have led to exponential growth in single-cell sequencing datasets across different conditions. Combining these datasets helps to better understand cellular identity and function. However, it is challenging to integrate datasets from different laboratories or technologies because of batch effects, which are interspersed with biological variation. To overcome this problem, we propose Single Cell Integration by Disentangled Representation Learning (SCIDRL), a domain adaptation-based method that learns low-dimensional representations invariant to batch effects. The method efficiently removes batch effects while retaining cell type purity. We applied it to thirteen diverse simulated and real datasets. Benchmark results show that SCIDRL outperforms other methods in most cases and performs especially well in two common situations: (i) effective integration of batch-shared rare cell types and preservation of batch-specific rare cell types; (ii) reliable integration of datasets with different cell compositions. These results indicate that SCIDRL offers a valuable tool for researchers studying cell heterogeneity.
format Online
Article
Text
id pubmed-8788944
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8788944 2022-01-26 Nucleic Acids Res Methods Online Oxford University Press 2021-11-24 /pmc/articles/PMC8788944/ /pubmed/34850092 http://dx.doi.org/10.1093/nar/gkab978 Text en
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by-nc/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Integration of single cell data by disentangled representation learning
topic Methods Online