Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data
BACKGROUND: The analysis of single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the detection of differentially expressed (DE) genes. scRNAseq...
Main authors: | Wang, Tianyu; Li, Boyang; Nelson, Craig E.; Nabavi, Sheida |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6339299/ https://www.ncbi.nlm.nih.gov/pubmed/30658573 http://dx.doi.org/10.1186/s12859-019-2599-6 |
_version_ | 1783388606392434688 |
---|---|
author | Wang, Tianyu Li, Boyang Nelson, Craig E. Nabavi, Sheida |
author_facet | Wang, Tianyu Li, Boyang Nelson, Craig E. Nabavi, Sheida |
author_sort | Wang, Tianyu |
collection | PubMed |
description | BACKGROUND: The analysis of single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the detection of differentially expressed (DE) genes. scRNAseq data, however, are highly heterogeneous and have a large number of zero counts, which introduces challenges in detecting DE genes. Addressing these challenges requires employing new approaches beyond the conventional ones, which are based on a nonzero difference in average expression. Several methods have been developed for differential gene expression analysis of scRNAseq data. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to evaluate and compare the performance of differential gene expression analysis methods for scRNAseq data. RESULTS: In this study, we conducted a comprehensive evaluation of the performance of eleven differential gene expression analysis software tools, which are designed for scRNAseq data or can be applied to them. We used simulated and real data to evaluate the accuracy and precision of detection. Using simulated data, we investigated the effect of sample size on the detection accuracy of the tools. Using real data, we examined the agreement among the tools in identifying DE genes, the run time of the tools, and the biological relevance of the detected DE genes. CONCLUSIONS: In general, agreement among the tools in calling DE genes is not high. There is a trade-off between true-positive rates and the precision of calling DE genes. Methods with higher true positive rates tend to show low precision due to their introducing false positives, whereas methods with high precision show low true positive rates due to identifying few DE genes. We observed that current methods designed for scRNAseq data do not tend to show better performance compared to methods designed for bulk RNAseq data. Data multimodality and abundance of zero read counts are the main characteristics of scRNAseq data, which play important roles in the performance of differential gene expression analysis methods and need to be considered in terms of the development of new methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2599-6) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6339299 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6339299 2019-01-23 Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data Wang, Tianyu Li, Boyang Nelson, Craig E. Nabavi, Sheida BMC Bioinformatics Research Article BACKGROUND: The analysis of single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the detection of differentially expressed (DE) genes. scRNAseq data, however, are highly heterogeneous and have a large number of zero counts, which introduces challenges in detecting DE genes. Addressing these challenges requires employing new approaches beyond the conventional ones, which are based on a nonzero difference in average expression. Several methods have been developed for differential gene expression analysis of scRNAseq data. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to evaluate and compare the performance of differential gene expression analysis methods for scRNAseq data. RESULTS: In this study, we conducted a comprehensive evaluation of the performance of eleven differential gene expression analysis software tools, which are designed for scRNAseq data or can be applied to them. We used simulated and real data to evaluate the accuracy and precision of detection. Using simulated data, we investigated the effect of sample size on the detection accuracy of the tools. Using real data, we examined the agreement among the tools in identifying DE genes, the run time of the tools, and the biological relevance of the detected DE genes. CONCLUSIONS: In general, agreement among the tools in calling DE genes is not high. There is a trade-off between true-positive rates and the precision of calling DE genes. Methods with higher true positive rates tend to show low precision due to their introducing false positives, whereas methods with high precision show low true positive rates due to identifying few DE genes. We observed that current methods designed for scRNAseq data do not tend to show better performance compared to methods designed for bulk RNAseq data. Data multimodality and abundance of zero read counts are the main characteristics of scRNAseq data, which play important roles in the performance of differential gene expression analysis methods and need to be considered in terms of the development of new methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2599-6) contains supplementary material, which is available to authorized users. BioMed Central 2019-01-18 /pmc/articles/PMC6339299/ /pubmed/30658573 http://dx.doi.org/10.1186/s12859-019-2599-6 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Wang, Tianyu Li, Boyang Nelson, Craig E. Nabavi, Sheida Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data |
title | Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data |
title_full | Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data |
title_fullStr | Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data |
title_full_unstemmed | Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data |
title_short | Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data |
title_sort | comparative analysis of differential gene expression analysis tools for single-cell rna sequencing data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6339299/ https://www.ncbi.nlm.nih.gov/pubmed/30658573 http://dx.doi.org/10.1186/s12859-019-2599-6 |
work_keys_str_mv | AT wangtianyu comparativeanalysisofdifferentialgeneexpressionanalysistoolsforsinglecellrnasequencingdata AT liboyang comparativeanalysisofdifferentialgeneexpressionanalysistoolsforsinglecellrnasequencingdata AT nelsoncraige comparativeanalysisofdifferentialgeneexpressionanalysistoolsforsinglecellrnasequencingdata AT nabavisheida comparativeanalysisofdifferentialgeneexpressionanalysistoolsforsinglecellrnasequencingdata |