
demuxmix: Demultiplexing oligonucleotide-barcoded single-cell RNA sequencing data with regression mixture models


Bibliographic Details
Main author: Klein, Hans-Ulrich
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9901175/
https://www.ncbi.nlm.nih.gov/pubmed/36747615
http://dx.doi.org/10.1101/2023.01.27.525961
collection PubMed
description MOTIVATION: Droplet-based single-cell RNA sequencing (scRNA-seq) is widely used in biomedical research to interrogate the transcriptomes of single cells on a large scale. Pooling and processing cells from different samples together can reduce costs and batch effects. Before pooling, cells are often labeled with hashtag oligonucleotides (HTOs). These HTOs are sequenced along with the cells’ RNA in the droplets and are subsequently used to computationally assign each droplet to its sample of origin, a step referred to as demultiplexing. Accurate demultiplexing is crucial and can be challenging due to background HTOs, low-quality cells/cell debris, and multiplets. RESULTS: A new demultiplexing method, demuxmix, based on negative binomial regression mixture models is introduced. The method implements two significant improvements. First, demuxmix’s probabilistic classification framework provides error probabilities for droplet assignments, which can be used to discard uncertain droplets and to assess the quality of the HTO data and the success of the demultiplexing. Second, demuxmix uses the positive association between the number of detected genes in the RNA library and the HTO counts to explain part of the variance in the HTO data, resulting in improved droplet assignments. The improved performance of demuxmix compared to existing demultiplexing methods is assessed on real and simulated data. Finally, the feasibility of accurately demultiplexing experimental designs where non-labeled cells are pooled with labeled cells is demonstrated. AVAILABILITY: R/Bioconductor package demuxmix (https://doi.org/doi:10.18129/B9.bioc.demuxmix)
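The two ideas summarized under RESULTS (posterior-based error probabilities, and a regression on the number of detected genes) can be illustrated with a minimal Python sketch. This is not the demuxmix implementation or its API: the function names, the log-linear form of the regression, and every parameter value in `HYPOTHETICAL_PARAMS` are assumptions chosen for illustration, and no model fitting is performed.

```python
import math

# Hypothetical, hand-picked parameters for ONE hashtag (not fitted, not from the paper).
HYPOTHETICAL_PARAMS = {
    "pi_pos": 0.25,          # assumed prior share of droplets carrying this HTO
    "neg_coef": (0.5, 0.3),  # (intercept, slope) of the background component
    "pos_coef": (3.0, 0.5),  # (intercept, slope) of the labeled component
    "neg_size": 5.0,         # negative binomial dispersion ("size"), background
    "pos_size": 5.0,         # negative binomial dispersion ("size"), labeled
}


def nb_logpmf(k, mu, size):
    """Log-probability of count k under a negative binomial with mean mu."""
    return (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )


def component_mean(coef, n_genes):
    """Assumed log-linear regression: expected HTO count given detected genes."""
    intercept, slope = coef
    return math.exp(intercept + slope * math.log(n_genes))


def posterior_positive(hto_count, n_genes, params):
    """Posterior probability that the droplet belongs to the labeled component."""
    log_neg = math.log(1.0 - params["pi_pos"]) + nb_logpmf(
        hto_count, component_mean(params["neg_coef"], n_genes), params["neg_size"]
    )
    log_pos = math.log(params["pi_pos"]) + nb_logpmf(
        hto_count, component_mean(params["pos_coef"], n_genes), params["pos_size"]
    )
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_neg, log_pos)
    w_neg, w_pos = math.exp(log_neg - m), math.exp(log_pos - m)
    return w_pos / (w_neg + w_pos)


def classify(hto_count, n_genes, params, threshold=0.9):
    """Assign a label only if its posterior probability reaches the threshold."""
    p = posterior_positive(hto_count, n_genes, params)
    if p >= threshold:
        return "positive"
    if 1.0 - p >= threshold:
        return "negative"
    return "uncertain"
```

In this sketch, the same HTO count can be read differently depending on the covariate: a moderate count in a droplet with few detected genes points toward a labeled cell, while the same count in a droplet with many detected genes is more consistent with background. Droplets whose posterior stays below the threshold for both components are returned as "uncertain" and could be discarded.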
format Online Article Text
id pubmed-9901175
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-9901175 2023-02-07 bioRxiv Article Cold Spring Harbor Laboratory 2023-01-29 /pmc/articles/PMC9901175/ /pubmed/36747615 http://dx.doi.org/10.1101/2023.01.27.525961 Text en This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
topic Article