Evaluation of single-cell classifiers for single-cell RNA sequencing data sets

Single-cell RNA sequencing (scRNA-seq) has been rapidly developing and widely applied in biological and medical research. Identification of cell types in scRNA-seq data sets is an essential step before in-depth investigations of their functional and pathological roles. However, the conventional work...

Bibliographic Details
Main Authors:	Zhao, Xinlei; Wu, Shuang; Fang, Nan; Sun, Xiao; Fan, Jue
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2019
Subjects:	Review Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7947964/ https://www.ncbi.nlm.nih.gov/pubmed/31675098 http://dx.doi.org/10.1093/bib/bbz096

author	Zhao, Xinlei Wu, Shuang Fang, Nan Sun, Xiao Fan, Jue
collection	PubMed
description	Single-cell RNA sequencing (scRNA-seq) has been rapidly developing and widely applied in biological and medical research. Identification of cell types in scRNA-seq data sets is an essential step before in-depth investigations of their functional and pathological roles. However, the conventional workflow based on clustering and marker genes is not scalable for an increasingly large number of scRNA-seq data sets due to complicated procedures and manual annotation. Therefore, a number of tools have been developed recently to predict cell types in new data sets using reference data sets. These methods have not been generally adopted due to a lack of tool benchmarking and user guidance. In this article, we performed a comprehensive and impartial evaluation of nine classification software tools specifically designed for scRNA-seq data sets. Results showed that Seurat based on random forest, SingleR based on correlation analysis and CaSTLe based on XGBoost performed better than others. A simple ensemble voting of all tools can improve the predictive accuracy. Under nonideal situations, such as small-sized and class-imbalanced reference data sets, tools based on cluster-level similarities have superior performance. However, even with the function of assigning ‘unassigned’ labels, it is still challenging to detect novel cell types by solely using any of the single-cell classifiers. This article provides a guideline for researchers to select and apply suitable classification tools in their analysis workflows and sheds some light on potential directions for future improvement of classification tools.
format	Online Article Text
id	pubmed-7947964
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-7947964 2021-03-16 Evaluation of single-cell classifiers for single-cell RNA sequencing data sets Zhao, Xinlei Wu, Shuang Fang, Nan Sun, Xiao Fan, Jue Brief Bioinform Review Article Oxford University Press 2019-10-23 /pmc/articles/PMC7947964/ /pubmed/31675098 http://dx.doi.org/10.1093/bib/bbz096 Text en © The Author(s) 2019. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Evaluation of single-cell classifiers for single-cell RNA sequencing data sets
topic	Review Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7947964/ https://www.ncbi.nlm.nih.gov/pubmed/31675098 http://dx.doi.org/10.1093/bib/bbz096
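
The description above notes that a simple ensemble voting of all tools can improve predictive accuracy. This record does not state the voting rule the authors used, so the following is only a minimal sketch of one plausible reading: a per-cell plurality vote over the labels returned by several classifiers. The function name `ensemble_vote`, the tool names, the example labels, and the choice to mark ties as 'unassigned' are all illustrative assumptions, not details taken from the article.

```python
from collections import Counter


def ensemble_vote(predictions, unassigned="unassigned"):
    """Plurality vote across tools for each cell.

    predictions: dict mapping a tool name to its list of predicted
    labels, one label per cell, in the same cell order for every tool.
    A tie for the most frequent label yields `unassigned`; this
    tie-breaking rule is an assumption, not the article's method.
    """
    lengths = {len(labels) for labels in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("every tool must label the same number of cells")

    voted = []
    # zip(*...) regroups the per-tool lists into per-cell tuples.
    for cell_labels in zip(*predictions.values()):
        counts = Counter(cell_labels).most_common()
        top_label, top_count = counts[0]
        tied = len(counts) > 1 and counts[1][1] == top_count
        voted.append(unassigned if tied else top_label)
    return voted


# Hypothetical labels from three tools for four cells.
example = {
    "tool_a": ["T cell", "B cell", "NK cell", "T cell"],
    "tool_b": ["T cell", "B cell", "T cell", "B cell"],
    "tool_c": ["T cell", "monocyte", "NK cell", "monocyte"],
}
consensus = ensemble_vote(example)
```

In this toy input the first three cells each have a clear majority label, while the fourth cell receives three different labels and is therefore left as 'unassigned' under the tie rule assumed here.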
