Vaeda computationally annotates doublets in single-cell RNA sequencing data
MOTIVATION: Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology’s utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doubl...
Main authors: | Schriever, Hannah; Kostka, Dennis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9805559/ https://www.ncbi.nlm.nih.gov/pubmed/36342203 http://dx.doi.org/10.1093/bioinformatics/btac720 |
_version_ | 1784862353038245888 |
---|---|
author | Schriever, Hannah Kostka, Dennis |
author_facet | Schriever, Hannah Kostka, Dennis |
author_sort | Schriever, Hannah |
collection | PubMed |
description | MOTIVATION: Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology’s utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doublets/multiplets are a type of artifact that occurs when two or more cells are tagged by the same barcode, and therefore they appear as a single cell. Because this introduces non-existent transcriptional profiles, doublets can bias and mislead downstream analysis. To address this limitation, computational methods to annotate and remove doublets from scRNA-seq datasets are needed. RESULTS: We introduce vaeda (Variational Auto-Encoder for Doublet Annotation), a new approach for computational annotation of doublets in scRNA-seq data. Vaeda integrates a variational auto-encoder and Positive-Unlabeled learning to produce doublet scores and binary doublet calls. We apply vaeda, along with seven existing doublet annotation methods, to 16 benchmark datasets and find that vaeda performs competitively in terms of doublet scores and doublet calls. Notably, vaeda outperforms other Python-based methods for doublet annotation. Altogether, vaeda is a robust and competitive method for scRNA-seq doublet annotation and may be of particular interest in the context of Python-based workflows. AVAILABILITY AND IMPLEMENTATION: Vaeda is available at https://github.com/kostkalab/vaeda, and the version used for the results we present here is archived at Zenodo (https://doi.org/10.5281/zenodo.7199783). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9805559 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9805559 2023-01-03 Vaeda computationally annotates doublets in single-cell RNA sequencing data Schriever, Hannah Kostka, Dennis Bioinformatics Original Paper MOTIVATION: Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology’s utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doublets/multiplets are a type of artifact that occurs when two or more cells are tagged by the same barcode, and therefore they appear as a single cell. Because this introduces non-existent transcriptional profiles, doublets can bias and mislead downstream analysis. To address this limitation, computational methods to annotate and remove doublets from scRNA-seq datasets are needed. RESULTS: We introduce vaeda (Variational Auto-Encoder for Doublet Annotation), a new approach for computational annotation of doublets in scRNA-seq data. Vaeda integrates a variational auto-encoder and Positive-Unlabeled learning to produce doublet scores and binary doublet calls. We apply vaeda, along with seven existing doublet annotation methods, to 16 benchmark datasets and find that vaeda performs competitively in terms of doublet scores and doublet calls. Notably, vaeda outperforms other Python-based methods for doublet annotation. Altogether, vaeda is a robust and competitive method for scRNA-seq doublet annotation and may be of particular interest in the context of Python-based workflows. AVAILABILITY AND IMPLEMENTATION: Vaeda is available at https://github.com/kostkalab/vaeda, and the version used for the results we present here is archived at Zenodo (https://doi.org/10.5281/zenodo.7199783). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-11-07 /pmc/articles/PMC9805559/ /pubmed/36342203 http://dx.doi.org/10.1093/bioinformatics/btac720 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Schriever, Hannah Kostka, Dennis Vaeda computationally annotates doublets in single-cell RNA sequencing data |
title | Vaeda computationally annotates doublets in single-cell RNA sequencing data |
title_full | Vaeda computationally annotates doublets in single-cell RNA sequencing data |
title_fullStr | Vaeda computationally annotates doublets in single-cell RNA sequencing data |
title_full_unstemmed | Vaeda computationally annotates doublets in single-cell RNA sequencing data |
title_short | Vaeda computationally annotates doublets in single-cell RNA sequencing data |
title_sort | vaeda computationally annotates doublets in single-cell rna sequencing data |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9805559/ https://www.ncbi.nlm.nih.gov/pubmed/36342203 http://dx.doi.org/10.1093/bioinformatics/btac720 |
work_keys_str_mv | AT schrieverhannah vaedacomputationallyannotatesdoubletsinsinglecellrnasequencingdata AT kostkadennis vaedacomputationallyannotatesdoubletsinsinglecellrnasequencingdata |