Reproducibility of Methods to Detect Differentially Expressed Genes from Single-Cell RNA Sequencing
Detection of differentially expressed genes is a common task in single-cell RNA-seq (scRNA-seq) studies. Various methods based on both bulk-cell and single-cell approaches are in current use. Due to the unique distributional characteristics of single-cell data, it is important to compare these metho...
Main authors: | Mou, Tian; Deng, Wenjiang; Gu, Fengyun; Pawitan, Yudi; Vu, Trung Nghia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6979262/ https://www.ncbi.nlm.nih.gov/pubmed/32010190 http://dx.doi.org/10.3389/fgene.2019.01331 |
_version_ | 1783490862262517760 |
---|---|
author | Mou, Tian Deng, Wenjiang Gu, Fengyun Pawitan, Yudi Vu, Trung Nghia |
author_facet | Mou, Tian Deng, Wenjiang Gu, Fengyun Pawitan, Yudi Vu, Trung Nghia |
author_sort | Mou, Tian |
collection | PubMed |
description | Detection of differentially expressed genes is a common task in single-cell RNA-seq (scRNA-seq) studies. Various methods based on both bulk-cell and single-cell approaches are in current use. Due to the unique distributional characteristics of single-cell data, it is important to compare these methods with rigorous statistical assessments. In this study, we assess the reproducibility of nine tools for differential expression analysis in scRNA-seq data. These tools include four methods originally designed for scRNA-seq data, three popular methods that were originally developed for bulk-cell RNA-seq data but have been applied in scRNA-seq analysis, and two general statistical tests. Instead of comparing performance across all genes, we compare the methods in terms of the rediscovery rates (RDRs) of top-ranked genes, separately for highly and lowly expressed genes. Three real and one simulated scRNA-seq data sets are used for the comparisons. The results indicate that some widely used methods, such as edgeR and monocle, have worse RDR performance than the other methods, especially for the top-ranked genes. For highly expressed genes, many bulk-cell–based methods can perform similarly to the methods designed for scRNA-seq data. For lowly expressed genes, however, performance varies substantially: edgeR and monocle are too liberal and have poor control of false positives, while DESeq2 is too conservative and consequently loses sensitivity compared to the other methods. BPSC, Limma, DEsingle, MAST, t-test and Wilcoxon perform similarly in the real data sets. Overall, the scRNA-seq-based method BPSC performs well against the other methods, particularly when there is a sufficient number of cells. |
format | Online Article Text |
id | pubmed-6979262 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6979262 2020-02-01 Reproducibility of Methods to Detect Differentially Expressed Genes from Single-Cell RNA Sequencing Mou, Tian Deng, Wenjiang Gu, Fengyun Pawitan, Yudi Vu, Trung Nghia Front Genet Genetics Frontiers Media S.A. 2020-01-17 /pmc/articles/PMC6979262/ /pubmed/32010190 http://dx.doi.org/10.3389/fgene.2019.01331 Text en Copyright © 2020 Mou, Deng, Gu, Pawitan and Vu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Mou, Tian Deng, Wenjiang Gu, Fengyun Pawitan, Yudi Vu, Trung Nghia Reproducibility of Methods to Detect Differentially Expressed Genes from Single-Cell RNA Sequencing |
title | Reproducibility of Methods to Detect Differentially Expressed Genes from Single-Cell RNA Sequencing |
title_sort | reproducibility of methods to detect differentially expressed genes from single-cell rna sequencing |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6979262/ https://www.ncbi.nlm.nih.gov/pubmed/32010190 http://dx.doi.org/10.3389/fgene.2019.01331 |