
SCIBER: a simple method for removing batch effects from single-cell RNA-sequencing data

MOTIVATION: Integrative analysis of multiple single-cell RNA-sequencing datasets allows for more comprehensive characterizations of cell types, but systematic technical differences between datasets, known as ‘batch effects’, need to be removed before integration to avoid misleading interpretation of...


Bibliographic Details
Main authors: Gan, Dailin; Li, Jun
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9848058/
https://www.ncbi.nlm.nih.gov/pubmed/36548380
http://dx.doi.org/10.1093/bioinformatics/btac819
author Gan, Dailin
Li, Jun
collection PubMed
description MOTIVATION: Integrative analysis of multiple single-cell RNA-sequencing datasets allows for more comprehensive characterizations of cell types, but systematic technical differences between datasets, known as ‘batch effects’, need to be removed before integration to avoid misleading interpretation of the data. Although many batch-effect-removal methods have been developed, there is still considerable room for improvement: most existing methods output only dimension-reduced data rather than expression data for individual genes, rely on computationally demanding models, and are black boxes that are difficult to interpret or tune.
RESULTS: Here, we present a new batch-effect-removal method called SCIBER (Single-Cell Integrator and Batch Effect Remover) and study its performance on real datasets. SCIBER matches cell clusters across batches according to the overlap of their differentially expressed genes. Although it is a simple algorithm that scales better to data with a large number of cells and is easy to tune, SCIBER shows comparable and sometimes better accuracy in removing batch effects on real datasets than state-of-the-art methods, which are much more complicated. Moreover, SCIBER outputs expression data in the original space, that is, the expression of individual genes, which can be used directly for downstream analyses. Additionally, SCIBER is a reference-based method: it assigns one of the batches as the reference batch and keeps it untouched during the process, making it especially suitable for integrating user-generated datasets with standard reference data such as the Human Cell Atlas.
AVAILABILITY AND IMPLEMENTATION: SCIBER is publicly available as an R package on CRAN: https://cran.r-project.org/web/packages/SCIBER/. A vignette is included in the CRAN R package.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
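The abstract states that SCIBER matches cell clusters across batches according to the overlap of their differentially expressed genes (DEGs). The following Python sketch illustrates only that general idea and is not the SCIBER implementation (which is an R package). The function name `match_clusters_by_deg_overlap`, the `min_overlap` parameter, the raw shared-gene count used as the score, and the toy gene sets are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set, Tuple


def match_clusters_by_deg_overlap(
    ref_degs: Dict[str, Set[str]],
    query_degs: Dict[str, Set[str]],
    min_overlap: int = 1,
) -> Dict[str, Tuple[str, int]]:
    """Assign each query cluster to the reference cluster sharing the most DEGs.

    Hypothetical helper: the score is the plain count of shared genes, an
    assumption chosen for simplicity rather than the published criterion.
    """
    matches: Dict[str, Tuple[str, int]] = {}
    for q_name, q_genes in query_degs.items():
        best_ref: Optional[str] = None
        best_count = 0
        for r_name, r_genes in ref_degs.items():
            count = len(q_genes & r_genes)  # number of shared DEGs
            if count > best_count:
                best_ref, best_count = r_name, count
        # Leave the query cluster unmatched if the overlap is too small.
        if best_ref is not None and best_count >= min_overlap:
            matches[q_name] = (best_ref, best_count)
    return matches


# Toy DEG sets (illustrative only): the reference batch stays untouched,
# and clusters from the other batch are matched against it.
reference = {
    "ref_T": {"CD3D", "CD3E", "IL7R"},
    "ref_B": {"MS4A1", "CD79A", "CD79B"},
}
query = {
    "q1": {"CD79A", "MS4A1", "HLA-DRA"},
    "q2": {"CD3D", "IL7R", "CCR7"},
    "q3": {"PPBP", "PF4"},
}

matches = match_clusters_by_deg_overlap(reference, query)
print(matches)
```

In this toy input, `q3` shares no DEGs with any reference cluster and is therefore left unmatched. A real pipeline would also need a rule for such clusters and a statistically grounded overlap score.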
format Online
Article
Text
id pubmed-9848058
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9848058 2023-01-20 SCIBER: a simple method for removing batch effects from single-cell RNA-sequencing data. Gan, Dailin; Li, Jun. Bioinformatics, Original Paper. Oxford University Press, 2022-12-22. /pmc/articles/PMC9848058/ /pubmed/36548380 http://dx.doi.org/10.1093/bioinformatics/btac819 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title SCIBER: a simple method for removing batch effects from single-cell RNA-sequencing data
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9848058/
https://www.ncbi.nlm.nih.gov/pubmed/36548380
http://dx.doi.org/10.1093/bioinformatics/btac819