
SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation

A main challenge in analyzing single-cell RNA sequencing (scRNA-seq) data is to reduce technical variations yet retain cell heterogeneity. Due to low mRNAs content per cell and molecule losses during the experiment (called ‘dropout’), the gene expression matrix has a substantial amount of zero read...

Bibliographic Details
Main authors: Hu, Zhirui; Zu, Songpeng; Liu, Jun S
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7526005/
https://www.ncbi.nlm.nih.gov/pubmed/33029585
http://dx.doi.org/10.1093/nargab/lqaa077
author Hu, Zhirui
Zu, Songpeng
Liu, Jun S
collection PubMed
description A main challenge in analyzing single-cell RNA sequencing (scRNA-seq) data is to reduce technical variations yet retain cell heterogeneity. Due to low mRNAs content per cell and molecule losses during the experiment (called ‘dropout’), the gene expression matrix has a substantial amount of zero read counts. Existing imputation methods treat either each cell or each gene as independently and identically distributed, which oversimplifies the gene correlation and cell type structure. We propose a statistical model-based approach, called SIMPLEs (SIngle-cell RNA-seq iMPutation and celL clustErings), which iteratively identifies correlated gene modules and cell clusters and imputes dropouts customized for individual gene module and cell type. Simultaneously, it quantifies the uncertainty of imputation and cell clustering via multiple imputations. In simulations, SIMPLEs performed significantly better than prevailing scRNA-seq imputation methods according to various metrics. By applying SIMPLEs to several real datasets, we discovered gene modules that can further classify subtypes of cells. Our imputations successfully recovered the expression trends of marker genes in stem cell differentiation and can discover putative pathways regulating biological processes.
format Online
Article
Text
id pubmed-7526005
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7526005 2020-10-05 SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation Hu, Zhirui Zu, Songpeng Liu, Jun S NAR Genom Bioinform Methods Article A main challenge in analyzing single-cell RNA sequencing (scRNA-seq) data is to reduce technical variations yet retain cell heterogeneity. Due to low mRNAs content per cell and molecule losses during the experiment (called ‘dropout’), the gene expression matrix has a substantial amount of zero read counts. Existing imputation methods treat either each cell or each gene as independently and identically distributed, which oversimplifies the gene correlation and cell type structure. We propose a statistical model-based approach, called SIMPLEs (SIngle-cell RNA-seq iMPutation and celL clustErings), which iteratively identifies correlated gene modules and cell clusters and imputes dropouts customized for individual gene module and cell type. Simultaneously, it quantifies the uncertainty of imputation and cell clustering via multiple imputations. In simulations, SIMPLEs performed significantly better than prevailing scRNA-seq imputation methods according to various metrics. By applying SIMPLEs to several real datasets, we discovered gene modules that can further classify subtypes of cells. Our imputations successfully recovered the expression trends of marker genes in stem cell differentiation and can discover putative pathways regulating biological processes. Oxford University Press 2020-09-28 /pmc/articles/PMC7526005/ /pubmed/33029585 http://dx.doi.org/10.1093/nargab/lqaa077 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation
topic Methods Article