BUTTERFLY: addressing the pooled amplification paradox with unique molecular identifiers in single-cell RNA-seq

Bibliographic Details
Main Authors: Gustafsson, Johan; Robinson, Jonathan; Nielsen, Jens; Pachter, Lior
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8188791/
https://www.ncbi.nlm.nih.gov/pubmed/34103073
http://dx.doi.org/10.1186/s13059-021-02386-z
Description
Summary: The incorporation of unique molecular identifiers (UMIs) in single-cell RNA-seq assays makes it possible to identify duplicated molecules, thereby facilitating the counting of distinct molecules from sequenced reads. However, we show that the naïve removal of duplicates can lead to a bias due to a “pooled amplification paradox,” and we propose an improved quantification method based on unseen species modeling. Our correction, called BUTTERFLY, uses a zero-truncated negative binomial estimator implemented in the kallisto bustools workflow. We demonstrate its efficacy across cell types and genes and show that in some cases it can invert the relative abundance of genes. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-021-02386-z.
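
Illustrative note: the summary names a zero-truncated negative binomial estimator for unseen species modeling. The sketch below shows one generic way such an estimator can be used, and it is not the BUTTERFLY or kallisto bustools implementation. It assumes a histogram of copies per UMI (how many molecules were seen once, twice, and so on), fits the zero-truncated distribution by a crude pattern search, and estimates how many molecules were never sequenced. The function names, the search bounds, and the toy histogram are all hypothetical.

```python
import math

LOG_BOUND = 6.0  # assumed search box for log(size) and log(mu); keeps the math stable


def ztnb_log_pmf(k, size, mu):
    """Log pmf of a negative binomial (dispersion `size`, mean `mu`) given k >= 1."""
    log_p = math.log(size / (size + mu))
    log_q = math.log(mu / (size + mu))
    log_nb = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
              + size * log_p + k * log_q)
    log_p0 = size * log_p                      # log P(K = 0) before truncation
    return log_nb - math.log(-math.expm1(log_p0))


def log_likelihood(hist, size, mu):
    """hist maps copy number k (>= 1) to the number of molecules seen k times."""
    return sum(n * ztnb_log_pmf(k, size, mu) for k, n in hist.items())


def fit_ztnb(hist, rounds=300):
    """Crude maximum-likelihood fit: compass search over (log size, log mu)."""
    ls, lm, step = 0.0, 0.0, 4.0
    best_ll = log_likelihood(hist, math.exp(ls), math.exp(lm))
    for _ in range(rounds):
        best_point = None
        for a in (ls - step, ls, ls + step):
            for b in (lm - step, lm, lm + step):
                if abs(a) > LOG_BOUND or abs(b) > LOG_BOUND:
                    continue
                ll = log_likelihood(hist, math.exp(a), math.exp(b))
                if ll > best_ll:               # move only on strict improvement
                    best_ll, best_point = ll, (a, b)
        if best_point is None:
            step /= 2.0                        # no better neighbour: refine the grid
        else:
            ls, lm = best_point
    return math.exp(ls), math.exp(lm)


def estimate_total_molecules(hist):
    """Observed molecules divided by the fitted probability of being seen at all."""
    size, mu = fit_ztnb(hist)
    p0 = (size / (size + mu)) ** size          # fitted fraction of unseen molecules
    return sum(hist.values()) / (1.0 - p0)


# Toy histogram (made-up numbers, for illustration only).
toy_hist = {1: 500, 2: 250, 3: 125, 4: 60, 5: 30, 6: 15}
print("observed molecules:", sum(toy_hist.values()))
print("estimated total molecules:", round(estimate_total_molecules(toy_hist), 1))
```

The estimate always exceeds the naïve deduplicated count, because the fitted model assigns some probability to molecules with zero sequenced copies. A production fit would use a proper optimizer and would need to handle the weak identifiability of the dispersion parameter.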