Alevin efficiently estimates accurate gene abundances from dscRNA-seq data
We introduce alevin, a fast end-to-end pipeline to process droplet-based single-cell RNA sequencing data, performing cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin’s approach to UMI deduplication con...
Main authors: | Srivastava, Avi; Malik, Laraib; Smith, Tom; Sudbery, Ian; Patro, Rob |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Method |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437997/ https://www.ncbi.nlm.nih.gov/pubmed/30917859 http://dx.doi.org/10.1186/s13059-019-1670-y |
_version_ | 1783407037183426560 |
---|---|
author | Srivastava, Avi; Malik, Laraib; Smith, Tom; Sudbery, Ian; Patro, Rob |
author_facet | Srivastava, Avi; Malik, Laraib; Smith, Tom; Sudbery, Ian; Patro, Rob |
author_sort | Srivastava, Avi |
collection | PubMed |
description | We introduce alevin, a fast end-to-end pipeline to process droplet-based single-cell RNA sequencing data, performing cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin’s approach to UMI deduplication considers transcript-level constraints on the molecules from which UMIs may have arisen and accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools which discard gene-ambiguous reads and improves the accuracy of gene abundance estimates. Alevin is considerably faster, typically eight times, than existing gene quantification approaches, while also using less memory. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-019-1670-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6437997 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-64379972019-04-08 Alevin efficiently estimates accurate gene abundances from dscRNA-seq data Srivastava, Avi Malik, Laraib Smith, Tom Sudbery, Ian Patro, Rob Genome Biol Method We introduce alevin, a fast end-to-end pipeline to process droplet-based single-cell RNA sequencing data, performing cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin’s approach to UMI deduplication considers transcript-level constraints on the molecules from which UMIs may have arisen and accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools which discard gene-ambiguous reads and improves the accuracy of gene abundance estimates. Alevin is considerably faster, typically eight times, than existing gene quantification approaches, while also using less memory. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-019-1670-y) contains supplementary material, which is available to authorized users. BioMed Central 2019-03-27 /pmc/articles/PMC6437997/ /pubmed/30917859 http://dx.doi.org/10.1186/s13059-019-1670-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Method Srivastava, Avi Malik, Laraib Smith, Tom Sudbery, Ian Patro, Rob Alevin efficiently estimates accurate gene abundances from dscRNA-seq data |
title | Alevin efficiently estimates accurate gene abundances from dscRNA-seq data |
title_full | Alevin efficiently estimates accurate gene abundances from dscRNA-seq data |
title_fullStr | Alevin efficiently estimates accurate gene abundances from dscRNA-seq data |
title_full_unstemmed | Alevin efficiently estimates accurate gene abundances from dscRNA-seq data |
title_short | Alevin efficiently estimates accurate gene abundances from dscRNA-seq data |
title_sort | alevin efficiently estimates accurate gene abundances from dscrna-seq data |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437997/ https://www.ncbi.nlm.nih.gov/pubmed/30917859 http://dx.doi.org/10.1186/s13059-019-1670-y |
work_keys_str_mv | AT srivastavaavi alevinefficientlyestimatesaccurategeneabundancesfromdscrnaseqdata AT maliklaraib alevinefficientlyestimatesaccurategeneabundancesfromdscrnaseqdata AT smithtom alevinefficientlyestimatesaccurategeneabundancesfromdscrnaseqdata AT sudberyian alevinefficientlyestimatesaccurategeneabundancesfromdscrnaseqdata AT patrorob alevinefficientlyestimatesaccurategeneabundancesfromdscrnaseqdata |