Recommendations of scRNA-seq Differential Gene Expression Analysis Based on Comprehensive Benchmarking

To guide analysts to select the right tool and parameters in differential gene expression analyses of single-cell RNA sequencing (scRNA-seq) data, we developed a novel simulator that recapitulates the data characteristics of real scRNA-seq datasets while accounting for all the relevant sources of va...

Bibliographic Details
Main authors: Gagnon, Jake, Pi, Lira, Ryals, Matthew, Wan, Qingwen, Hu, Wenxing, Ouyang, Zhengyu, Zhang, Baohong, Li, Kejie
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9225332/
https://www.ncbi.nlm.nih.gov/pubmed/35743881
http://dx.doi.org/10.3390/life12060850
_version_ 1784733592325193728
author Gagnon, Jake
Pi, Lira
Ryals, Matthew
Wan, Qingwen
Hu, Wenxing
Ouyang, Zhengyu
Zhang, Baohong
Li, Kejie
author_facet Gagnon, Jake
Pi, Lira
Ryals, Matthew
Wan, Qingwen
Hu, Wenxing
Ouyang, Zhengyu
Zhang, Baohong
Li, Kejie
author_sort Gagnon, Jake
collection PubMed
description To guide analysts to select the right tool and parameters in differential gene expression analyses of single-cell RNA sequencing (scRNA-seq) data, we developed a novel simulator that recapitulates the data characteristics of real scRNA-seq datasets while accounting for all the relevant sources of variation in a multi-subject, multi-condition scRNA-seq experiment: the cell-to-cell variation within a subject, the variation across subjects, the variability across cell types, the mean/variance relationship of gene expression across genes, library size effects, group effects, and covariate effects. By applying it to benchmark 12 differential gene expression analysis methods (including cell-level and pseudo-bulk methods) on simulated multi-condition, multi-subject data of the 10x Genomics platform, we demonstrated that methods originating from the negative binomial mixed model such as glmmTMB and NEBULA-HL outperformed other methods. Utilizing NEBULA-HL in a statistical analysis pipeline for single-cell analysis will enable scientists to better understand the cell-type-specific transcriptomic response to disease or treatment effects and to discover new drug targets. Further, application to two real datasets showed the outperformance of our differential expression (DE) pipeline, with unified findings of differentially expressed genes (DEG) and a pseudo-time trajectory transcriptomic result. In the end, we made recommendations for filtering strategies of cells and genes based on simulation results to achieve optimal experimental goals.
format Online
Article
Text
id pubmed-9225332
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9225332 2022-06-24 Recommendations of scRNA-seq Differential Gene Expression Analysis Based on Comprehensive Benchmarking Gagnon, Jake Pi, Lira Ryals, Matthew Wan, Qingwen Hu, Wenxing Ouyang, Zhengyu Zhang, Baohong Li, Kejie Life (Basel) Article To guide analysts to select the right tool and parameters in differential gene expression analyses of single-cell RNA sequencing (scRNA-seq) data, we developed a novel simulator that recapitulates the data characteristics of real scRNA-seq datasets while accounting for all the relevant sources of variation in a multi-subject, multi-condition scRNA-seq experiment: the cell-to-cell variation within a subject, the variation across subjects, the variability across cell types, the mean/variance relationship of gene expression across genes, library size effects, group effects, and covariate effects. By applying it to benchmark 12 differential gene expression analysis methods (including cell-level and pseudo-bulk methods) on simulated multi-condition, multi-subject data of the 10x Genomics platform, we demonstrated that methods originating from the negative binomial mixed model such as glmmTMB and NEBULA-HL outperformed other methods. Utilizing NEBULA-HL in a statistical analysis pipeline for single-cell analysis will enable scientists to better understand the cell-type-specific transcriptomic response to disease or treatment effects and to discover new drug targets. Further, application to two real datasets showed the outperformance of our differential expression (DE) pipeline, with unified findings of differentially expressed genes (DEG) and a pseudo-time trajectory transcriptomic result. In the end, we made recommendations for filtering strategies of cells and genes based on simulation results to achieve optimal experimental goals. MDPI 2022-06-07 /pmc/articles/PMC9225332/ /pubmed/35743881 http://dx.doi.org/10.3390/life12060850 Text en © 2022 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Gagnon, Jake
Pi, Lira
Ryals, Matthew
Wan, Qingwen
Hu, Wenxing
Ouyang, Zhengyu
Zhang, Baohong
Li, Kejie
Recommendations of scRNA-seq Differential Gene Expression Analysis Based on Comprehensive Benchmarking
title Recommendations of scRNA-seq Differential Gene Expression Analysis Based on Comprehensive Benchmarking
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9225332/
https://www.ncbi.nlm.nih.gov/pubmed/35743881
http://dx.doi.org/10.3390/life12060850