Reproducibility of Methods to Detect Differentially Expressed Genes from Single-Cell RNA Sequencing

Detection of differentially expressed genes is a common task in single-cell RNA-seq (scRNA-seq) studies. Various methods based on both bulk-cell and single-cell approaches are in current use. Due to the unique distributional characteristics of single-cell data, it is important to compare these metho...


Bibliographic Details
Main authors: Mou, Tian; Deng, Wenjiang; Gu, Fengyun; Pawitan, Yudi; Vu, Trung Nghia
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6979262/
https://www.ncbi.nlm.nih.gov/pubmed/32010190
http://dx.doi.org/10.3389/fgene.2019.01331
author Mou, Tian
Deng, Wenjiang
Gu, Fengyun
Pawitan, Yudi
Vu, Trung Nghia
collection PubMed
description Detection of differentially expressed genes is a common task in single-cell RNA-seq (scRNA-seq) studies. Various methods based on both bulk-cell and single-cell approaches are in current use. Due to the unique distributional characteristics of single-cell data, it is important to compare these methods with rigorous statistical assessments. In this study, we assess the reproducibility of nine tools for differential expression analysis in scRNA-seq data. These tools include four methods originally designed for scRNA-seq data, three popular methods that were originally developed for bulk-cell RNA-seq data but have since been applied in scRNA-seq analysis, and two general statistical tests. Instead of comparing performance across all genes, we compare the methods in terms of the rediscovery rates (RDRs) of top-ranked genes, separately for highly and lowly expressed genes. Three real and one simulated scRNA-seq data sets are used for the comparisons. The results indicate that some widely used methods, such as edgeR and monocle, have worse RDR performance than the other methods, especially for the top-ranked genes. For highly expressed genes, many bulk-cell–based methods perform similarly to the methods designed for scRNA-seq data. For lowly expressed genes, however, performance varies substantially: edgeR and monocle are too liberal and have poor control of false positives, while DESeq2 is too conservative and consequently loses sensitivity compared to the other methods. BPSC, Limma, DEsingle, MAST, t-test and Wilcoxon perform similarly in the real data sets. Overall, the scRNA-seq based method BPSC performs well against the other methods, particularly when there is a sufficient number of cells.
format Online
Article
Text
id pubmed-6979262
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6979262 2020-02-01 Front Genet Genetics Frontiers Media S.A. 2020-01-17 /pmc/articles/PMC6979262/ /pubmed/32010190 http://dx.doi.org/10.3389/fgene.2019.01331 Text en Copyright © 2020 Mou, Deng, Gu, Pawitan and Vu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Reproducibility of Methods to Detect Differentially Expressed Genes from Single-Cell RNA Sequencing
topic Genetics
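
The abstract compares methods by the rediscovery rate (RDR) of top-ranked genes but this record does not define it. The sketch below shows one plausible reading, offered as an assumption and not as the authors' exact procedure: rank genes by p-value in a training split, take the top k, and report the fraction that are also significant in an independent validation split. The function name `rediscovery_rate`, the `alpha` threshold of 0.05, and the toy p-values are all hypothetical.

```python
from typing import Mapping


def rediscovery_rate(
    train_pvalues: Mapping[str, float],
    validation_pvalues: Mapping[str, float],
    top_k: int,
    alpha: float = 0.05,  # assumed significance threshold, not taken from the paper
) -> float:
    """Fraction of the top_k training genes that are significant in validation.

    This is an illustrative definition of a rediscovery rate, assumed for
    this sketch; the paper's exact definition may differ.
    """
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    if not train_pvalues:
        raise ValueError("train_pvalues must not be empty")

    # Rank genes by ascending training p-value; break ties by gene name
    # so the ranking is deterministic.
    ranked = sorted(train_pvalues, key=lambda gene: (train_pvalues[gene], gene))
    top_genes = ranked[:top_k]

    # A gene missing from the validation results counts as not rediscovered.
    rediscovered = sum(
        1 for gene in top_genes if validation_pvalues.get(gene, 1.0) <= alpha
    )
    return rediscovered / len(top_genes)


# Toy p-values, made up purely for illustration.
toy_train = {"geneA": 0.001, "geneB": 0.004, "geneC": 0.02, "geneD": 0.3}
toy_validation = {"geneA": 0.01, "geneB": 0.2, "geneC": 0.03, "geneD": 0.5}

if __name__ == "__main__":
    for k in (2, 3):
        print(k, rediscovery_rate(toy_train, toy_validation, top_k=k))
```

Under this reading, a method that is too liberal would tend to place false positives among its top-ranked training genes, which then fail to reappear in validation and lower the rate. Computing it separately for highly and lowly expressed genes, as the abstract describes, would amount to calling the function on each gene subset.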