
CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq

The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is howev...


Bibliographic Details
Main Authors: Yu, Wenbo; Mahfouz, Ahmed; Reinders, Marcel J. T.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8076908/
https://www.ncbi.nlm.nih.gov/pubmed/33927748
http://dx.doi.org/10.3389/fgene.2021.644211
_version_ 1783684784921247744
author Yu, Wenbo
Mahfouz, Ahmed
Reinders, Marcel J. T.
author_facet Yu, Wenbo
Mahfouz, Ahmed
Reinders, Marcel J. T.
author_sort Yu, Wenbo
collection PubMed
description The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is however hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that we can constrain this matching further as cells can also be matched on their cell type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell-type. The cell-type guidance is unsupervised, i.e., a cell-type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters.
format Online
Article
Text
id pubmed-8076908
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8076908 2021-04-28 CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq Yu, Wenbo Mahfouz, Ahmed Reinders, Marcel J. T. Front Genet Genetics The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is however hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that we can constrain this matching further as cells can also be matched on their cell type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell-type. The cell-type guidance is unsupervised, i.e., a cell-type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters. Frontiers Media S.A. 2021-04-13 /pmc/articles/PMC8076908/ /pubmed/33927748 http://dx.doi.org/10.3389/fgene.2021.644211 Text en Copyright © 2021 Yu, Mahfouz and Reinders.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Yu, Wenbo
Mahfouz, Ahmed
Reinders, Marcel J. T.
CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
title CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
title_full CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
title_fullStr CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
title_full_unstemmed CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
title_short CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
title_sort cba: cluster-guided batch alignment for single cell rna-seq
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8076908/
https://www.ncbi.nlm.nih.gov/pubmed/33927748
http://dx.doi.org/10.3389/fgene.2021.644211
work_keys_str_mv AT yuwenbo cbaclusterguidedbatchalignmentforsinglecellrnaseq
AT mahfouzahmed cbaclusterguidedbatchalignmentforsinglecellrnaseq
AT reindersmarceljt cbaclusterguidedbatchalignmentforsinglecellrnaseq
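
The description field of this record outlines a loss with two distance-preservation terms: (1) cell-cell distances within each batch, and (2) cell-cell distances between batches, restricted to cells of the same cell type. The following is a minimal illustrative sketch of such a loss in Python with NumPy, not the authors' implementation. Several things are assumptions made for illustration: the function names `pairwise_dist` and `distance_preservation_loss`, the squared-error form of each term, the equal weighting of the terms, and the premise that cluster labels have already been matched across the two batches so that equal labels mean the same cell type (the record does not say how clusters are matched). The auto-encoder itself is omitted; `z1` and `z2` stand for whatever embeddings it would produce.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distance between every row of `a` and every row of `b`."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def distance_preservation_loss(x1, x2, z1, z2, labels1, labels2):
    """Hypothetical two-term loss (an assumption, not the published formula).

    x1, x2           : cells-by-genes matrices of the two batches (same genes)
    z1, z2           : cells-by-dims embeddings of the same cells
    labels1, labels2 : cluster labels, assumed already matched across batches
    """
    # Term (1): preserve cell-cell distances within each batch.
    within = (
        np.mean((pairwise_dist(x1, x1) - pairwise_dist(z1, z1)) ** 2)
        + np.mean((pairwise_dist(x2, x2) - pairwise_dist(z2, z2)) ** 2)
    )

    # Term (2): preserve between-batch distances, but only for pairs of
    # cells that carry the same (assumed matched) cluster label.
    same_type = labels1[:, None] == labels2[None, :]
    if same_type.any():
        d_orig = pairwise_dist(x1, x2)
        d_emb = pairwise_dist(z1, z2)
        between = np.mean((d_orig[same_type] - d_emb[same_type]) ** 2)
    else:
        between = 0.0

    return within + between


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x1 = rng.normal(size=(6, 4))         # toy batch 1: 6 cells, 4 genes
    x2 = rng.normal(size=(5, 4))         # toy batch 2: 5 cells, 4 genes
    labels1 = np.array([0, 0, 1, 1, 2, 2])
    labels2 = np.array([0, 1, 1, 2, 3])  # label 3 has no partner in batch 1
    # Stand-in "embedding": keep the first two gene dimensions.
    z1, z2 = x1[:, :2], x2[:, :2]
    print(distance_preservation_loss(x1, x2, z1, z2, labels1, labels2))
```

In this sketch, cells whose cluster has no counterpart in the other batch (label 3 above) contribute only to the within-batch term, so nothing pulls them toward an unrelated cluster. In a real training loop the same computation would be written in an automatic-differentiation framework so that gradients can reach the encoder.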