Sparsity-Penalized Stacked Denoising Autoencoders for Imputing Single-Cell RNA-seq Data
Single-cell RNA-seq (scRNA-seq) is widely used to study transcriptomes, but its data contain excessive zeros, some true and others false. False zeros, which can be treated as missing data, obstruct downstream analysis of scRNA-seq data. How to distinguish true zero...
Main authors: | Chi, Weilai; Deng, Minghua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7291078/ https://www.ncbi.nlm.nih.gov/pubmed/32403260 http://dx.doi.org/10.3390/genes11050532 |
_version_ | 1783545825944666112 |
---|---|
author | Chi, Weilai Deng, Minghua |
collection | PubMed |
description | Single-cell RNA-seq (scRNA-seq) is widely used to study transcriptomes, but its data contain excessive zeros, some true and others false. False zeros, which can be treated as missing data, obstruct downstream analysis of scRNA-seq data. Distinguishing true zeros from false ones is the key to this problem. Here, we propose sparsity-penalized stacked denoising autoencoders (scSDAE) to impute scRNA-seq data. scSDAE adopts stacked denoising autoencoders with a sparsity penalty, as well as a layer-wise pretraining procedure to improve model fitting. It can capture nonlinear relationships among the data and incorporate information about the observed zeros. We tested how well scSDAE recovers the true values of gene expression and how much it helps downstream analysis. First, we show that scSDAE can recover the true values and the sample–sample correlations of bulk sequencing data with simulated noise. Next, we demonstrate that scSDAE accurately imputes an RNA mixture dataset with different dilutions and spike-in RNA concentrations affected by technical zeros, and improves the consistency of RNA and protein levels in CITE-seq data. Finally, we show that scSDAE can help downstream clustering analysis. In summary, we develop a deep learning-based method, scSDAE, to impute single-cell RNA-seq data affected by technical zeros, and show that it can recover the true values to some extent and help downstream analysis. |
format | Online Article Text |
id | pubmed-7291078 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7291078 2020-06-19 Sparsity-Penalized Stacked Denoising Autoencoders for Imputing Single-Cell RNA-seq Data Chi, Weilai Deng, Minghua Genes (Basel) Article MDPI 2020-05-11 /pmc/articles/PMC7291078/ /pubmed/32403260 http://dx.doi.org/10.3390/genes11050532 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Sparsity-Penalized Stacked Denoising Autoencoders for Imputing Single-Cell RNA-seq Data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7291078/ https://www.ncbi.nlm.nih.gov/pubmed/32403260 http://dx.doi.org/10.3390/genes11050532 |
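
The description above names three ingredients: a denoising autoencoder, a sparsity penalty, and use of the observed zeros. The following is a minimal NumPy sketch of that general idea, not the authors' scSDAE implementation. It assumes a single hidden layer (no stacking and no layer-wise pretraining), a squared-error loss on observed non-zero entries, and an L1 penalty on the reconstruction at observed zeros; that loss, the function names `train_sparse_dae` and `impute`, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np


def train_sparse_dae(X, hidden=8, drop_prob=0.2, lam=0.1, lr=0.05,
                     epochs=200, seed=0):
    """Train a one-hidden-layer denoising autoencoder (illustrative only).

    Assumed loss: squared error on observed non-zero entries plus an
    L1 penalty (weight `lam`) on the output at observed zeros.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    W1 = rng.normal(0.0, 0.1, (n_genes, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, n_genes))
    b2 = np.zeros(n_genes)
    nonzero = (X > 0).astype(float)   # entries with observed expression
    zero = 1.0 - nonzero              # observed zeros (true or false)
    losses = []
    for _ in range(epochs):
        # Denoising step: randomly mask inputs, reconstruct the clean X.
        keep = rng.random(X.shape) >= drop_prob
        Xc = X * keep
        H_pre = Xc @ W1 + b1
        H = np.maximum(H_pre, 0.0)    # ReLU encoder
        Y = H @ W2 + b2               # linear decoder
        diff = (Y - X) * nonzero
        loss = (0.5 * np.sum(diff ** 2)
                + lam * np.sum(np.abs(Y) * zero)) / n_cells
        losses.append(float(loss))
        # Backpropagation through the two layers.
        dY = (diff + lam * np.sign(Y) * zero) / n_cells
        dW2 = H.T @ dY
        db2 = dY.sum(axis=0)
        dH_pre = (dY @ W2.T) * (H_pre > 0)
        dW1 = Xc.T @ dH_pre
        db1 = dH_pre.sum(axis=0)
        # Clip the global gradient norm so every update stays bounded.
        grads = [dW1, db1, dW2, db2]
        norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
        step = lr * min(1.0, 5.0 / (norm + 1e-12))
        W1 -= step * dW1
        b1 -= step * db1
        W2 -= step * dW2
        b2 -= step * db2
    return (W1, b1, W2, b2), losses


def impute(X, params):
    """Keep observed non-zero values; fill zeros with the clipped output."""
    W1, b1, W2, b2 = params
    Y = np.maximum(X @ W1 + b1, 0.0) @ W2 + b2
    return np.where(X > 0, X, np.maximum(Y, 0.0))


# Synthetic demo data (not from the paper): log-transformed Poisson
# counts with roughly 30% of entries zeroed out to mimic dropouts.
demo_rng = np.random.default_rng(42)
X_demo = np.log1p(demo_rng.poisson(3.0, size=(50, 20)).astype(float))
X_demo[demo_rng.random(X_demo.shape) < 0.3] = 0.0
params_demo, losses_demo = train_sparse_dae(X_demo)
X_imputed = impute(X_demo, params_demo)
print(X_imputed.shape, len(losses_demo))
```

In this sketch the L1 term pulls reconstructions at observed zeros toward zero, so a zero is only filled with a large value when the rest of the cell's profile strongly supports it. A faithful reproduction would need the loss, architecture, and pretraining schedule from the article itself.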