Benchmarking integration of single-cell differential expression

Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential express...

Bibliographic Details
Main Authors: Nguyen, Hai C. T., Baik, Bukyung, Yoon, Sora, Park, Taesung, Nam, Dougu
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10030080/
https://www.ncbi.nlm.nih.gov/pubmed/36944632
http://dx.doi.org/10.1038/s41467-023-37126-3
_version_ 1784910281700278272
author Nguyen, Hai C. T.
Baik, Bukyung
Yoon, Sora
Park, Taesung
Nam, Dougu
collection PubMed
description Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performances. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis for substantial batch effects. We show that for low depth data, single-cell techniques based on zero-inflation model deteriorate the performance, whereas the analysis of uncorrected data using limmatrend, Wilcoxon test and fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes.
format Online
Article
Text
id pubmed-10030080
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10030080 2023-03-22 Benchmarking integration of single-cell differential expression Nguyen, Hai C. T. Baik, Bukyung Yoon, Sora Park, Taesung Nam, Dougu Nat Commun Article Nature Publishing Group UK 2023-03-21 /pmc/articles/PMC10030080/ /pubmed/36944632 http://dx.doi.org/10.1038/s41467-023-37126-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Benchmarking integration of single-cell differential expression
topic Article