doubletD: detecting doublets in single-cell DNA sequencing data

MOTIVATION: While single-cell DNA sequencing (scDNA-seq) has enabled the study of intratumor heterogeneity at an unprecedented resolution, current technologies are error-prone and often result in doublets where two or more cells are mistaken for a single cell. Not only do doublets confound downstrea...

Bibliographic Details
Main Authors: Weber, Leah L, Sashittal, Palash, El-Kebir, Mohammed
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Genome Sequence Analysis
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275324/
https://www.ncbi.nlm.nih.gov/pubmed/34252961
http://dx.doi.org/10.1093/bioinformatics/btab266
_version_ 1783721689833537536
author Weber, Leah L
Sashittal, Palash
El-Kebir, Mohammed
collection PubMed
description MOTIVATION: While single-cell DNA sequencing (scDNA-seq) has enabled the study of intratumor heterogeneity at an unprecedented resolution, current technologies are error-prone and often result in doublets where two or more cells are mistaken for a single cell. Not only do doublets confound downstream analyses, but the increase in doublet rate is also a major bottleneck preventing higher throughput with current single-cell technologies. Although doublet detection and removal are standard practice in scRNA-seq data analysis, options for scDNA-seq data are limited. Current methods attempt to detect doublets while also performing complex downstream analyses tasks, leading to decreased efficiency and/or performance. RESULTS: We present doubletD, the first standalone method for detecting doublets in scDNA-seq data. Underlying our method is a simple maximum likelihood approach with a closed-form solution. We demonstrate the performance of doubletD on simulated data as well as real datasets, outperforming current methods for downstream analysis of scDNA-seq data that jointly infer doublets as well as standalone approaches for doublet detection in scRNA-seq data. Incorporating doubletD in scDNA-seq analysis pipelines will reduce complexity and lead to more accurate results. AVAILABILITY AND IMPLEMENTATION: https://github.com/elkebir-group/doubletD. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8275324
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8275324 2021-07-13 doubletD: detecting doublets in single-cell DNA sequencing data Weber, Leah L Sashittal, Palash El-Kebir, Mohammed Bioinformatics Genome Sequence Analysis Oxford University Press 2021-07-12 /pmc/articles/PMC8275324/ /pubmed/34252961 http://dx.doi.org/10.1093/bioinformatics/btab266 Text en © The Author(s) 2021. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title doubletD: detecting doublets in single-cell DNA sequencing data
topic Genome Sequence Analysis
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275324/
https://www.ncbi.nlm.nih.gov/pubmed/34252961
http://dx.doi.org/10.1093/bioinformatics/btab266