Benchmarking integration of single-cell differential expression
Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential express...
Main Authors: | Nguyen, Hai C. T.; Baik, Bukyung; Yoon, Sora; Park, Taesung; Nam, Dougu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10030080/ https://www.ncbi.nlm.nih.gov/pubmed/36944632 http://dx.doi.org/10.1038/s41467-023-37126-3 |
_version_ | 1784910281700278272 |
---|---|
author | Nguyen, Hai C. T. Baik, Bukyung Yoon, Sora Park, Taesung Nam, Dougu |
author_facet | Nguyen, Hai C. T. Baik, Bukyung Yoon, Sora Park, Taesung Nam, Dougu |
author_sort | Nguyen, Hai C. T. |
collection | PubMed |
description | Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performance. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis for substantial batch effects. We show that for low-depth data, single-cell techniques based on zero-inflation models degrade performance, whereas the analysis of uncorrected data using limmatrend, the Wilcoxon test and a fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes. |
format | Online Article Text |
id | pubmed-10030080 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10030080 2023-03-22 Benchmarking integration of single-cell differential expression Nguyen, Hai C. T. Baik, Bukyung Yoon, Sora Park, Taesung Nam, Dougu Nat Commun Article Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performances. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis for substantial batch effects. We show that for low depth data, single-cell techniques based on zero-inflation model deteriorate the performance, whereas the analysis of uncorrected data using limmatrend, Wilcoxon test and fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes. Nature Publishing Group UK 2023-03-21 /pmc/articles/PMC10030080/ /pubmed/36944632 http://dx.doi.org/10.1038/s41467-023-37126-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Nguyen, Hai C. T. Baik, Bukyung Yoon, Sora Park, Taesung Nam, Dougu Benchmarking integration of single-cell differential expression |
title | Benchmarking integration of single-cell differential expression |
title_full | Benchmarking integration of single-cell differential expression |
title_fullStr | Benchmarking integration of single-cell differential expression |
title_full_unstemmed | Benchmarking integration of single-cell differential expression |
title_short | Benchmarking integration of single-cell differential expression |
title_sort | benchmarking integration of single-cell differential expression |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10030080/ https://www.ncbi.nlm.nih.gov/pubmed/36944632 http://dx.doi.org/10.1038/s41467-023-37126-3 |