Cargando…

PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells

MOTIVATION: New single-cell technologies continue to fuel the explosive growth in the scale of heterogeneous single-cell data. However, existing computational methods are inadequately scalable to large datasets and therefore cannot uncover the complex cellular heterogeneity. RESULTS: We introduce a...

Descripción completa

Detalles Bibliográficos
Autores principales:	Stassen, Shobana V, Siu, Dickson M D, Lee, Kelvin C M, Ho, Joshua W K, So, Hayden K H, Tsia, Kevin K
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	Oxford University Press 2020
Materias:	Original Papers
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7203756/ https://www.ncbi.nlm.nih.gov/pubmed/31971583 http://dx.doi.org/10.1093/bioinformatics/btaa042

Descripción
Sumario:	MOTIVATION: New single-cell technologies continue to fuel the explosive growth in the scale of heterogeneous single-cell data. However, existing computational methods are inadequately scalable to large datasets and therefore cannot uncover the complex cellular heterogeneity. RESULTS: We introduce a highly scalable graph-based clustering algorithm PARC—Phenotyping by Accelerated Refined Community-partitioning—for large-scale, high-dimensional single-cell data (>1 million cells). Using large single-cell flow and mass cytometry, RNA-seq and imaging-based biophysical data, we demonstrate that PARC consistently outperforms state-of-the-art clustering algorithms without subsampling of cells, including Phenograph, FlowSOM and Flock, in terms of both speed and ability to robustly detect rare cell populations. For example, PARC can cluster a single-cell dataset of 1.1 million cells within 13 min, compared with >2 h for the next fastest graph-clustering algorithm. Our work presents a scalable algorithm to cope with increasingly large-scale single-cell analysis. AVAILABILITY AND IMPLEMENTATION: https://github.com/ShobiStassen/PARC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells

Ejemplares similares