
EmptyNN: A neural network based on positive and unlabeled learning to remove cell-free droplets and recover lost cells in scRNA-seq data

Droplet-based single-cell RNA sequencing (scRNA-seq) has significantly increased the number of cells profiled per experiment and revolutionized the study of individual transcriptomes. However, to maximize the biological signal, robust computational methods are needed to distinguish cell-free from ce...

Bibliographic Details
Main authors: Yan, Fangfang; Zhao, Zhongming; Simon, Lukas M.
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369248/
https://www.ncbi.nlm.nih.gov/pubmed/34430929
http://dx.doi.org/10.1016/j.patter.2021.100311
_version_ 1783739252949909504
author Yan, Fangfang
Zhao, Zhongming
Simon, Lukas M.
author_facet Yan, Fangfang
Zhao, Zhongming
Simon, Lukas M.
author_sort Yan, Fangfang
collection PubMed
description Droplet-based single-cell RNA sequencing (scRNA-seq) has significantly increased the number of cells profiled per experiment and revolutionized the study of individual transcriptomes. However, to maximize the biological signal, robust computational methods are needed to distinguish cell-free from cell-containing droplets. Here, we introduce a novel cell-calling algorithm called EmptyNN, which trains a neural network based on positive-unlabeled learning for improved filtering of barcodes. For benchmarking purposes, we leveraged cell hashing and genetic variation to provide ground truth. EmptyNN accurately removed cell-free droplets while recovering lost cell clusters, and achieved an area under the receiver operating characteristics of 94.73% and 96.30%, respectively. Comparisons to current state-of-the-art cell-calling algorithms demonstrated the superior performance of EmptyNN. EmptyNN was further applied to a single-nucleus RNA sequencing (snRNA-seq) dataset and showed good performance. Therefore, EmptyNN represents a powerful tool to enhance both scRNA-seq and snRNA-seq quality control analyses.
format Online
Article
Text
id pubmed-8369248
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8369248 2021-08-23 EmptyNN: A neural network based on positive and unlabeled learning to remove cell-free droplets and recover lost cells in scRNA-seq data Yan, Fangfang Zhao, Zhongming Simon, Lukas M. Patterns (N Y) Article Droplet-based single-cell RNA sequencing (scRNA-seq) has significantly increased the number of cells profiled per experiment and revolutionized the study of individual transcriptomes. However, to maximize the biological signal, robust computational methods are needed to distinguish cell-free from cell-containing droplets. Here, we introduce a novel cell-calling algorithm called EmptyNN, which trains a neural network based on positive-unlabeled learning for improved filtering of barcodes. For benchmarking purposes, we leveraged cell hashing and genetic variation to provide ground truth. EmptyNN accurately removed cell-free droplets while recovering lost cell clusters, and achieved an area under the receiver operating characteristics of 94.73% and 96.30%, respectively. Comparisons to current state-of-the-art cell-calling algorithms demonstrated the superior performance of EmptyNN. EmptyNN was further applied to a single-nucleus RNA sequencing (snRNA-seq) dataset and showed good performance. Therefore, EmptyNN represents a powerful tool to enhance both scRNA-seq and snRNA-seq quality control analyses. Elsevier 2021-07-20 /pmc/articles/PMC8369248/ /pubmed/34430929 http://dx.doi.org/10.1016/j.patter.2021.100311 Text en © 2021 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Article
Yan, Fangfang
Zhao, Zhongming
Simon, Lukas M.
EmptyNN: A neural network based on positive and unlabeled learning to remove cell-free droplets and recover lost cells in scRNA-seq data
title EmptyNN: A neural network based on positive and unlabeled learning to remove cell-free droplets and recover lost cells in scRNA-seq data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369248/
https://www.ncbi.nlm.nih.gov/pubmed/34430929
http://dx.doi.org/10.1016/j.patter.2021.100311