A benchmark of batch-effect correction methods for single-cell RNA sequencing data

BACKGROUND: Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integrati...

Bibliographic Details
Main Authors:	Tran, Hoa Thi Nhu; Ang, Kok Siong; Chevrier, Marion; Zhang, Xiaomeng; Lee, Nicole Yee Shin; Goh, Michelle; Chen, Jinmiao
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6964114/ https://www.ncbi.nlm.nih.gov/pubmed/31948481 http://dx.doi.org/10.1186/s13059-019-1850-9

collection	PubMed
description	BACKGROUND: Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integration with available computational resources is crucial. Here, we perform an in-depth benchmark study on available batch correction methods to determine the most suitable method for batch-effect removal. RESULTS: We compare 14 methods in terms of computational runtime, the ability to handle large datasets, and batch-effect correction efficacy while preserving cell type purity. Five scenarios are designed for the study: identical cell types with different technologies, non-identical cell types, multiple batches, big data, and simulated data. Performance is evaluated using four benchmarking metrics including kBET, LISI, ASW, and ARI. We also investigate the use of batch-corrected data to study differential gene expression. CONCLUSION: Based on our results, Harmony, LIGER, and Seurat 3 are the recommended methods for batch integration. Due to its significantly shorter runtime, Harmony is recommended as the first method to try, with the other methods as viable alternatives.
format	Online Article Text
id	pubmed-6964114
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6964114 2020-01-22 A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Tran, Hoa Thi Nhu; Ang, Kok Siong; Chevrier, Marion; Zhang, Xiaomeng; Lee, Nicole Yee Shin; Goh, Michelle; Chen, Jinmiao. Genome Biol. Research. BioMed Central 2020-01-16 /pmc/articles/PMC6964114/ /pubmed/31948481 http://dx.doi.org/10.1186/s13059-019-1850-9 Text en © The Author(s). 2020 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
