
Signature-scoring methods developed for bulk samples are not adequate for cancer single-cell RNA sequencing data


Bibliographic Details
Main Authors: Noureen, Nighat; Ye, Zhenqing; Chen, Yidong; Wang, Xiaojing; Zheng, Siyuan
Format: Online Article Text
Language: English
Published: eLife Sciences Publications, Ltd 2022
Subjects: Cancer Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8916770/
https://www.ncbi.nlm.nih.gov/pubmed/35212622
http://dx.doi.org/10.7554/eLife.71994
author Noureen, Nighat
Ye, Zhenqing
Chen, Yidong
Wang, Xiaojing
Zheng, Siyuan
collection PubMed
description Quantifying the activity of gene expression signatures is common in analyses of single-cell RNA sequencing data. Methods originally developed for bulk samples are often used for this purpose without accounting for contextual differences between bulk and single-cell data. More broadly, few attempts have been made to benchmark these methods. Here, we benchmark five such methods, including single sample gene set enrichment analysis (ssGSEA), Gene Set Variation Analysis (GSVA), AUCell, Single Cell Signature Explorer (SCSE), and a new method we developed, Jointly Assessing Signature Mean and Inferring Enrichment (JASMINE). Using cancer as an example, we show cancer cells consistently express more genes than normal cells. This imbalance leads to bias in performance by bulk-sample-based ssGSEA in gold standard tests and down sampling experiments. In contrast, single-cell-based methods are less susceptible. Our results suggest caution should be exercised when using bulk-sample-based methods in single-cell data analyses, and cellular contexts should be taken into consideration when designing benchmarking strategies.
format Online
Article
Text
id pubmed-8916770
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher eLife Sciences Publications, Ltd
record_format MEDLINE/PubMed
spelling pubmed-8916770 2022-03-12
eLife, Cancer Biology
eLife Sciences Publications, Ltd 2022-02-25
/pmc/articles/PMC8916770/ /pubmed/35212622 http://dx.doi.org/10.7554/eLife.71994
Text en © 2022, Noureen et al. This article is distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited.
title Signature-scoring methods developed for bulk samples are not adequate for cancer single-cell RNA sequencing data
topic Cancer Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8916770/
https://www.ncbi.nlm.nih.gov/pubmed/35212622
http://dx.doi.org/10.7554/eLife.71994