Alevin efficiently estimates accurate gene abundances from dscRNA-seq data

Bibliographic Details
Main authors: Srivastava, Avi; Malik, Laraib; Smith, Tom; Sudbery, Ian; Patro, Rob
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437997/
https://www.ncbi.nlm.nih.gov/pubmed/30917859
http://dx.doi.org/10.1186/s13059-019-1670-y
Description
Summary: We introduce alevin, a fast end-to-end pipeline to process droplet-based single-cell RNA sequencing data, performing cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin’s approach to UMI deduplication considers transcript-level constraints on the molecules from which UMIs may have arisen and accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools which discard gene-ambiguous reads and improves the accuracy of gene abundance estimates. Alevin is considerably faster, typically eight times, than existing gene quantification approaches, while also using less memory. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-019-1670-y) contains supplementary material, which is available to authorized users.
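
The bias the summary attributes to discarding gene-ambiguous reads can be illustrated with a toy sketch. This is not alevin's algorithm: the read tuples, the function names, and the even split of ambiguous UMIs across genes are all assumptions made for illustration, whereas the summary states that alevin uses transcript-level constraints to resolve such reads.

```python
from collections import defaultdict

# Hypothetical toy input: (cell barcode, UMI, set of genes the read maps to).
reads = [
    ("AAAC", "TTGA", {"geneA"}),
    ("AAAC", "TTGA", {"geneA"}),           # PCR duplicate of the read above
    ("AAAC", "CCGT", {"geneA", "geneB"}),  # gene-ambiguous read
    ("AAAC", "GGTA", {"geneB"}),
    ("AAAC", "ACGT", {"geneA", "geneB"}),  # gene-ambiguous read
]


def count_discarding_ambiguous(reads):
    """Count distinct UMIs per (cell, gene), dropping multi-gene reads."""
    seen = defaultdict(set)
    for cell, umi, genes in reads:
        if len(genes) == 1:
            (gene,) = genes
            seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


def count_sharing_ambiguous(reads):
    """Count distinct UMIs, splitting multi-gene UMIs evenly across genes.

    The even split is a stand-in assumption, not the method in the paper.
    """
    # Deduplicate first: one entry per (cell, UMI, gene set).
    unique = {(cell, umi, frozenset(genes)) for cell, umi, genes in reads}
    counts = defaultdict(float)
    for cell, umi, genes in unique:
        for gene in genes:
            counts[(cell, gene)] += 1.0 / len(genes)
    return dict(counts)


discarded = count_discarding_ambiguous(reads)
shared = count_sharing_ambiguous(reads)
print("discarding ambiguous reads:", discarded)
print("sharing ambiguous reads:   ", shared)
```

In this toy input, four distinct molecules are observed, but the discarding strategy reports only two of them, so both genes are undercounted. Any scheme that retains the ambiguous reads preserves the total; how the retained reads are apportioned between genes is the part the paper addresses.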