
Benchmarking methods for detecting differential states between conditions from multi-subject single-cell RNA-seq data

Single-cell RNA-sequencing (scRNA-seq) enables researchers to quantify transcriptomes of thousands of cells simultaneously and study transcriptomic changes between cells. scRNA-seq datasets increasingly include multisubject, multicondition experiments to investigate cell-type-specific differential s...


Bibliographic Details
Main authors: Junttila, Sini; Smolander, Johannes; Elo, Laura L
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Problem Solving Protocol
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9487674/
https://www.ncbi.nlm.nih.gov/pubmed/35880426
http://dx.doi.org/10.1093/bib/bbac286
collection PubMed
description Single-cell RNA-sequencing (scRNA-seq) enables researchers to quantify transcriptomes of thousands of cells simultaneously and study transcriptomic changes between cells. scRNA-seq datasets increasingly include multisubject, multicondition experiments to investigate cell-type-specific differential states (DS) between conditions. This can be performed by first identifying the cell types in all the subjects and then by performing a DS analysis between the conditions within each cell type. Naïve single-cell DS analysis methods that treat cells as statistically independent are subject to false positives in the presence of variation between biological replicates, an issue known as the pseudoreplicate bias. While several methods have already been introduced to carry out the statistical testing in multisubject scRNA-seq analysis, comparisons that include all these methods are currently lacking. Here, we performed a comprehensive comparison of 18 methods for the identification of DS changes between conditions from multisubject scRNA-seq data. Our results suggest that the pseudobulk methods generally performed best. Both pseudobulks and mixed models that model the subjects as a random effect were superior to the naïve single-cell methods that do not model the subjects in any way. While the naïve models achieved higher sensitivity than the pseudobulk methods and the mixed models, they were subject to a high number of false positives. In addition, accounting for subjects through latent variable modeling did not improve the performance of the naïve methods.
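The pseudobulk idea mentioned in the description can be illustrated with a minimal Python sketch. It is not one of the 18 benchmarked methods: the data layout, the function names (`pseudobulk`, `log_cpm`, `welch_t`, `ds_statistics`) and the toy counts are all hypothetical, and Welch's t statistic is used only as a simple stand-in for a proper count-based test. The point it shows is that counts are summed per subject within a cell type, so the test sees one value per subject rather than one per cell.

```python
import math
from statistics import mean, variance


def pseudobulk(cells):
    """Sum raw counts over all cells of each (cell_type, subject) pair.

    `cells` is an iterable of (subject, condition, cell_type, counts) tuples,
    where `counts` is a list of per-gene raw counts (hypothetical layout).
    """
    sums = {}
    condition_of = {}
    for subject, condition, cell_type, counts in cells:
        key = (cell_type, subject)
        previous = sums.get(key, [0] * len(counts))
        sums[key] = [a + b for a, b in zip(previous, counts)]
        condition_of[subject] = condition
    return sums, condition_of


def log_cpm(counts):
    """Library-size normalisation: log2(counts per million + 1)."""
    total = sum(counts)
    return [math.log2(c / total * 1e6 + 1) for c in counts]


def welch_t(x, y):
    """Welch's t statistic; each input holds one value per subject."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se if se > 0 else math.nan


def ds_statistics(cells, cell_type, cond_a, cond_b):
    """Per-gene statistic (cond_a minus cond_b) within one cell type."""
    sums, condition_of = pseudobulk(cells)
    groups = {cond_a: [], cond_b: []}
    for (ct, subject), counts in sums.items():
        if ct == cell_type and condition_of[subject] in groups:
            groups[condition_of[subject]].append(log_cpm(counts))
    n_genes = len(groups[cond_a][0])
    return [
        welch_t([s[g] for s in groups[cond_a]], [s[g] for s in groups[cond_b]])
        for g in range(n_genes)
    ]


# Toy data (invented): four subjects, two per condition, two genes.
cells = [
    ("s1", "ctrl", "T", [5, 10]), ("s1", "ctrl", "T", [3, 12]),
    ("s2", "ctrl", "T", [4, 9]),
    ("s3", "stim", "T", [20, 11]), ("s3", "stim", "T", [18, 10]),
    ("s4", "stim", "T", [25, 8]),
]
stats = ds_statistics(cells, "T", "ctrl", "stim")
```

Here the six cells collapse into four pseudobulk samples, so the effective sample size is the number of subjects. A naïve test run on the six cells directly would treat cells from the same subject as independent replicates, which is the source of the pseudoreplicate bias described above.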
format Online
Article
Text
id pubmed-9487674
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9487674 2022-09-21 Junttila, Sini; Smolander, Johannes; Elo, Laura L. Brief Bioinform. Problem Solving Protocol. Oxford University Press 2022-07-25 /pmc/articles/PMC9487674/ /pubmed/35880426 http://dx.doi.org/10.1093/bib/bbac286 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Problem Solving Protocol