
Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data

BACKGROUND: The analysis of single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the detection of differentially expressed (DE) genes. scRNAseq...


Bibliographic Details
Main Authors: Wang, Tianyu; Li, Boyang; Nelson, Craig E.; Nabavi, Sheida
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6339299/
https://www.ncbi.nlm.nih.gov/pubmed/30658573
http://dx.doi.org/10.1186/s12859-019-2599-6
_version_ 1783388606392434688
author Wang, Tianyu
Li, Boyang
Nelson, Craig E.
Nabavi, Sheida
author_facet Wang, Tianyu
Li, Boyang
Nelson, Craig E.
Nabavi, Sheida
author_sort Wang, Tianyu
collection PubMed
description BACKGROUND: The analysis of single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the detection of differentially expressed (DE) genes. scRNAseq data, however, are highly heterogeneous and have a large number of zero counts, which introduces challenges in detecting DE genes. Addressing these challenges requires employing new approaches beyond the conventional ones, which are based on a nonzero difference in average expression. Several methods have been developed for differential gene expression analysis of scRNAseq data. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to evaluate and compare the performance of differential gene expression analysis methods for scRNAseq data. RESULTS: In this study, we conducted a comprehensive evaluation of the performance of eleven differential gene expression analysis software tools, which are designed for scRNAseq data or can be applied to them. We used simulated and real data to evaluate the accuracy and precision of detection. Using simulated data, we investigated the effect of sample size on the detection accuracy of the tools. Using real data, we examined the agreement among the tools in identifying DE genes, the run time of the tools, and the biological relevance of the detected DE genes. CONCLUSIONS: In general, agreement among the tools in calling DE genes is not high. There is a trade-off between true-positive rates and the precision of calling DE genes. Methods with higher true positive rates tend to show low precision due to their introducing false positives, whereas methods with high precision show low true positive rates due to identifying few DE genes. We observed that current methods designed for scRNAseq data do not tend to show better performance compared to methods designed for bulk RNAseq data. 
Data multimodality and abundance of zero read counts are the main characteristics of scRNAseq data, which play important roles in the performance of differential gene expression analysis methods and need to be considered in terms of the development of new methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2599-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6339299
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-63392992019-01-23 Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data Wang, Tianyu Li, Boyang Nelson, Craig E. Nabavi, Sheida BMC Bioinformatics Research Article BACKGROUND: The analysis of single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the detection of differentially expressed (DE) genes. scRNAseq data, however, are highly heterogeneous and have a large number of zero counts, which introduces challenges in detecting DE genes. Addressing these challenges requires employing new approaches beyond the conventional ones, which are based on a nonzero difference in average expression. Several methods have been developed for differential gene expression analysis of scRNAseq data. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to evaluate and compare the performance of differential gene expression analysis methods for scRNAseq data. RESULTS: In this study, we conducted a comprehensive evaluation of the performance of eleven differential gene expression analysis software tools, which are designed for scRNAseq data or can be applied to them. We used simulated and real data to evaluate the accuracy and precision of detection. Using simulated data, we investigated the effect of sample size on the detection accuracy of the tools. Using real data, we examined the agreement among the tools in identifying DE genes, the run time of the tools, and the biological relevance of the detected DE genes. CONCLUSIONS: In general, agreement among the tools in calling DE genes is not high. There is a trade-off between true-positive rates and the precision of calling DE genes. 
Methods with higher true positive rates tend to show low precision due to their introducing false positives, whereas methods with high precision show low true positive rates due to identifying few DE genes. We observed that current methods designed for scRNAseq data do not tend to show better performance compared to methods designed for bulk RNAseq data. Data multimodality and abundance of zero read counts are the main characteristics of scRNAseq data, which play important roles in the performance of differential gene expression analysis methods and need to be considered in terms of the development of new methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2599-6) contains supplementary material, which is available to authorized users. BioMed Central 2019-01-18 /pmc/articles/PMC6339299/ /pubmed/30658573 http://dx.doi.org/10.1186/s12859-019-2599-6 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Wang, Tianyu
Li, Boyang
Nelson, Craig E.
Nabavi, Sheida
Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data
title Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data
title_sort comparative analysis of differential gene expression analysis tools for single-cell rna sequencing data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6339299/
https://www.ncbi.nlm.nih.gov/pubmed/30658573
http://dx.doi.org/10.1186/s12859-019-2599-6
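Illustrative note on the abstract's conclusion: the trade-off between true-positive rate and precision can be made concrete with a small sketch. This is not the authors' evaluation code. The function name `tpr_and_precision`, the gene labels, and the two call sets are hypothetical, chosen only to show how a liberal caller and a conservative caller score against a known set of DE genes, such as one available from simulated data.

```python
def tpr_and_precision(called, truth):
    """Return (true-positive rate, precision) for a set of DE gene calls.

    called: genes a tool reports as differentially expressed
    truth:  genes known to be differentially expressed (e.g. from simulation)
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    # TPR: fraction of the true DE genes that the tool recovered.
    tpr = true_positives / len(truth) if truth else 0.0
    # Precision: fraction of the tool's calls that are truly DE.
    precision = true_positives / len(called) if called else 0.0
    return tpr, precision


# Hypothetical ground truth and two hypothetical tools' calls.
truth = {"g1", "g2", "g3", "g4"}
liberal_calls = {"g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"}
conservative_calls = {"g1"}

liberal = tpr_and_precision(liberal_calls, truth)
conservative = tpr_and_precision(conservative_calls, truth)

print("liberal tool:      TPR=%.2f precision=%.2f" % liberal)
print("conservative tool: TPR=%.2f precision=%.2f" % conservative)
```

In this toy case the liberal caller recovers every true DE gene but half of its calls are false positives, while the conservative caller makes no false calls but misses most true DE genes, which mirrors the pattern the abstract describes.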