Cargando…

Demystifying “drop-outs” in single-cell UMI data

Many existing pipelines for scRNA-seq data apply pre-processing steps such as normalization or imputation to account for excessive zeros or “drop-outs." Here, we extensively analyze diverse UMI data sets to show that clustering should be the foremost step of the workflow. We observe that most d...

Descripción completa

Detalles Bibliográficos
Autores principales: Kim, Tae Hyun, Zhou, Xiang, Chen, Mengjie
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7412673/
https://www.ncbi.nlm.nih.gov/pubmed/32762710
http://dx.doi.org/10.1186/s13059-020-02096-y
Descripción
Sumario:Many existing pipelines for scRNA-seq data apply pre-processing steps such as normalization or imputation to account for excessive zeros or “drop-outs." Here, we extensively analyze diverse UMI data sets to show that clustering should be the foremost step of the workflow. We observe that most drop-outs disappear once cell-type heterogeneity is resolved, while imputing or normalizing heterogeneous data can introduce unwanted noise. We propose a novel framework HIPPO (Heterogeneity-Inspired Pre-Processing tOol) that leverages zero proportions to explain cellular heterogeneity and integrates feature selection with iterative clustering. HIPPO leads to downstream analysis with greater flexibility and interpretability compared to alternatives.