Vaeda computationally annotates doublets in single-cell RNA sequencing data

MOTIVATION: Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology’s utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doubl...

Bibliographic Details
Main Authors: Schriever, Hannah; Kostka, Dennis
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9805559/
https://www.ncbi.nlm.nih.gov/pubmed/36342203
http://dx.doi.org/10.1093/bioinformatics/btac720
_version_ 1784862353038245888
author Schriever, Hannah
Kostka, Dennis
author_facet Schriever, Hannah
Kostka, Dennis
author_sort Schriever, Hannah
collection PubMed
description MOTIVATION: Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology’s utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doublets/multiplets are a type of artifact that occurs when two or more cells are tagged by the same barcode, and therefore they appear as a single cell. Because this introduces non-existent transcriptional profiles, doublets can bias and mislead downstream analysis. To address this limitation, computational methods to annotate and remove doublets from scRNA-seq datasets are needed. RESULTS: We introduce vaeda (Variational Auto-Encoder for Doublet Annotation), a new approach for computational annotation of doublets in scRNA-seq data. Vaeda integrates a variational auto-encoder and Positive-Unlabeled learning to produce doublet scores and binary doublet calls. We apply vaeda, along with seven existing doublet annotation methods, to 16 benchmark datasets and find that vaeda performs competitively in terms of doublet scores and doublet calls. Notably, vaeda outperforms other Python-based methods for doublet annotation. Altogether, vaeda is a robust and competitive method for scRNA-seq doublet annotation and may be of particular interest in the context of Python-based workflows. AVAILABILITY AND IMPLEMENTATION: Vaeda is available at https://github.com/kostkalab/vaeda, and the version used for the results we present here is archived at Zenodo (https://doi.org/10.5281/zenodo.7199783). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9805559
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9805559 2023-01-03 Vaeda computationally annotates doublets in single-cell RNA sequencing data Schriever, Hannah Kostka, Dennis Bioinformatics Original Paper MOTIVATION: Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology’s utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doublets/multiplets are a type of artifact that occurs when two or more cells are tagged by the same barcode, and therefore they appear as a single cell. Because this introduces non-existent transcriptional profiles, doublets can bias and mislead downstream analysis. To address this limitation, computational methods to annotate and remove doublets from scRNA-seq datasets are needed. RESULTS: We introduce vaeda (Variational Auto-Encoder for Doublet Annotation), a new approach for computational annotation of doublets in scRNA-seq data. Vaeda integrates a variational auto-encoder and Positive-Unlabeled learning to produce doublet scores and binary doublet calls. We apply vaeda, along with seven existing doublet annotation methods, to 16 benchmark datasets and find that vaeda performs competitively in terms of doublet scores and doublet calls. Notably, vaeda outperforms other Python-based methods for doublet annotation. Altogether, vaeda is a robust and competitive method for scRNA-seq doublet annotation and may be of particular interest in the context of Python-based workflows. AVAILABILITY AND IMPLEMENTATION: Vaeda is available at https://github.com/kostkalab/vaeda, and the version used for the results we present here is archived at Zenodo (https://doi.org/10.5281/zenodo.7199783). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Oxford University Press 2022-11-07 /pmc/articles/PMC9805559/ /pubmed/36342203 http://dx.doi.org/10.1093/bioinformatics/btac720 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Paper
Schriever, Hannah
Kostka, Dennis
Vaeda computationally annotates doublets in single-cell RNA sequencing data
title Vaeda computationally annotates doublets in single-cell RNA sequencing data
title_full Vaeda computationally annotates doublets in single-cell RNA sequencing data
title_fullStr Vaeda computationally annotates doublets in single-cell RNA sequencing data
title_full_unstemmed Vaeda computationally annotates doublets in single-cell RNA sequencing data
title_short Vaeda computationally annotates doublets in single-cell RNA sequencing data
title_sort vaeda computationally annotates doublets in single-cell rna sequencing data
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9805559/
https://www.ncbi.nlm.nih.gov/pubmed/36342203
http://dx.doi.org/10.1093/bioinformatics/btac720
work_keys_str_mv AT schrieverhannah vaedacomputationallyannotatesdoubletsinsinglecellrnasequencingdata
AT kostkadennis vaedacomputationallyannotatesdoubletsinsinglecellrnasequencingdata