Integration of single cell data by disentangled representation learning
Recent developments in single-cell RNA-sequencing technologies have led to exponential growth in single-cell sequencing datasets across different conditions. Combining these datasets helps to better understand cellular identity and function. However, it is challenging to integrate different datasets...
Main authors: | Guo, Tiantian; Chen, Yang; Shi, Minglei; Li, Xiangyu; Zhang, Michael Q |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8788944/ https://www.ncbi.nlm.nih.gov/pubmed/34850092 http://dx.doi.org/10.1093/nar/gkab978 |
_version_ | 1784639658199613440 |
---|---|
author | Guo, Tiantian Chen, Yang Shi, Minglei Li, Xiangyu Zhang, Michael Q |
author_sort | Guo, Tiantian |
collection | PubMed |
description | Recent developments in single-cell RNA-sequencing technologies have led to exponential growth in single-cell sequencing datasets across different conditions. Combining these datasets helps to better understand cellular identity and function. However, it is challenging to integrate datasets from different laboratories or technologies because of batch effects, which are interspersed with biological variance. To overcome this problem, we have proposed Single Cell Integration by Disentangled Representation Learning (SCIDRL), a domain adaptation-based method, to learn low-dimensional representations invariant to batch effects. This method can efficiently remove batch effects while retaining cell type purity. We applied it to thirteen diverse simulated and real datasets. Benchmark results show that SCIDRL outperforms other methods in most cases and performs especially well in two common situations: (i) effective integration of batch-shared rare cell types and preservation of batch-specific rare cell types; (ii) reliable integration of datasets with different cell compositions. This demonstrates that SCIDRL will offer a valuable tool for researchers to decode the enigma of cell heterogeneity. |
format | Online Article Text |
id | pubmed-8788944 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8788944 2022-01-26 Integration of single cell data by disentangled representation learning Guo, Tiantian Chen, Yang Shi, Minglei Li, Xiangyu Zhang, Michael Q Nucleic Acids Res Methods Online Recent developments in single-cell RNA-sequencing technologies have led to exponential growth in single-cell sequencing datasets across different conditions. Combining these datasets helps to better understand cellular identity and function. However, it is challenging to integrate datasets from different laboratories or technologies because of batch effects, which are interspersed with biological variance. To overcome this problem, we have proposed Single Cell Integration by Disentangled Representation Learning (SCIDRL), a domain adaptation-based method, to learn low-dimensional representations invariant to batch effects. This method can efficiently remove batch effects while retaining cell type purity. We applied it to thirteen diverse simulated and real datasets. Benchmark results show that SCIDRL outperforms other methods in most cases and performs especially well in two common situations: (i) effective integration of batch-shared rare cell types and preservation of batch-specific rare cell types; (ii) reliable integration of datasets with different cell compositions. This demonstrates that SCIDRL will offer a valuable tool for researchers to decode the enigma of cell heterogeneity. Oxford University Press 2021-11-24 /pmc/articles/PMC8788944/ /pubmed/34850092 http://dx.doi.org/10.1093/nar/gkab978 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Integration of single cell data by disentangled representation learning |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8788944/ https://www.ncbi.nlm.nih.gov/pubmed/34850092 http://dx.doi.org/10.1093/nar/gkab978 |