
Comparison of clustering tools in R for medium-sized 10x Genomics single-cell RNA-sequencing data

Background: The commercially available 10x Genomics protocol to generate droplet-based single cell RNA-seq (scRNA-seq) data is enjoying growing popularity among researchers. Fundamental to the analysis of such scRNA-seq data is the ability to cluster similar or same cells into non-overlapping groups...


Bibliographic Details
Main Authors: Freytag, Saskia, Tian, Luyi, Lönnstedt, Ingrid, Ng, Milica, Bahlo, Melanie
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6124389/
https://www.ncbi.nlm.nih.gov/pubmed/30228881
http://dx.doi.org/10.12688/f1000research.15809.2
collection PubMed
description Background: The commercially available 10x Genomics protocol to generate droplet-based single cell RNA-seq (scRNA-seq) data is enjoying growing popularity among researchers. Fundamental to the analysis of such scRNA-seq data is the ability to cluster similar or same cells into non-overlapping groups. Many competing methods have been proposed for this task, but there is currently little guidance with regard to which method to use. Methods: Here we use one gold standard 10x Genomics dataset, generated from the mixture of three cell lines, as well as multiple silver standard 10x Genomics datasets generated from peripheral blood mononuclear cells to examine not only the accuracy but also running time and robustness of a dozen methods. Results: We found that Seurat outperformed other methods, although performance seems to be dependent on many factors, including the complexity of the studied system. Furthermore, we found that solutions produced by different methods have little in common with each other. Conclusions: In light of this we conclude that the choice of clustering tool crucially determines interpretation of scRNA-seq data generated by 10x Genomics. Hence practitioners and consumers should remain vigilant about the outcome of 10x Genomics scRNA-seq analysis.
format Online
Article
Text
id pubmed-6124389
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-6124389 2018-09-17 F1000Res Research Article F1000 Research Limited 2018-12-19 /pmc/articles/PMC6124389/ /pubmed/30228881 http://dx.doi.org/10.12688/f1000research.15809.2 Text en Copyright: © 2018 Freytag S et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research Article