SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation
A main challenge in analyzing single-cell RNA sequencing (scRNA-seq) data is to reduce technical variations yet retain cell heterogeneity. Due to low mRNAs content per cell and molecule losses during the experiment (called ‘dropout’), the gene expression matrix has a substantial amount of zero read...
Main authors: | Hu, Zhirui; Zu, Songpeng; Liu, Jun S |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2020 |
Subjects: | Methods Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7526005/ https://www.ncbi.nlm.nih.gov/pubmed/33029585 http://dx.doi.org/10.1093/nargab/lqaa077 |
_version_ | 1783588790360604672 |
---|---|
author | Hu, Zhirui Zu, Songpeng Liu, Jun S |
author_facet | Hu, Zhirui Zu, Songpeng Liu, Jun S |
author_sort | Hu, Zhirui |
collection | PubMed |
description | A main challenge in analyzing single-cell RNA sequencing (scRNA-seq) data is to reduce technical variations yet retain cell heterogeneity. Due to low mRNAs content per cell and molecule losses during the experiment (called ‘dropout’), the gene expression matrix has a substantial amount of zero read counts. Existing imputation methods treat either each cell or each gene as independently and identically distributed, which oversimplifies the gene correlation and cell type structure. We propose a statistical model-based approach, called SIMPLEs (SIngle-cell RNA-seq iMPutation and celL clustErings), which iteratively identifies correlated gene modules and cell clusters and imputes dropouts customized for individual gene module and cell type. Simultaneously, it quantifies the uncertainty of imputation and cell clustering via multiple imputations. In simulations, SIMPLEs performed significantly better than prevailing scRNA-seq imputation methods according to various metrics. By applying SIMPLEs to several real datasets, we discovered gene modules that can further classify subtypes of cells. Our imputations successfully recovered the expression trends of marker genes in stem cell differentiation and can discover putative pathways regulating biological processes. |
format | Online Article Text |
id | pubmed-7526005 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-75260052020-10-05 SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation Hu, Zhirui Zu, Songpeng Liu, Jun S NAR Genom Bioinform Methods Article A main challenge in analyzing single-cell RNA sequencing (scRNA-seq) data is to reduce technical variations yet retain cell heterogeneity. Due to low mRNAs content per cell and molecule losses during the experiment (called ‘dropout’), the gene expression matrix has a substantial amount of zero read counts. Existing imputation methods treat either each cell or each gene as independently and identically distributed, which oversimplifies the gene correlation and cell type structure. We propose a statistical model-based approach, called SIMPLEs (SIngle-cell RNA-seq iMPutation and celL clustErings), which iteratively identifies correlated gene modules and cell clusters and imputes dropouts customized for individual gene module and cell type. Simultaneously, it quantifies the uncertainty of imputation and cell clustering via multiple imputations. In simulations, SIMPLEs performed significantly better than prevailing scRNA-seq imputation methods according to various metrics. By applying SIMPLEs to several real datasets, we discovered gene modules that can further classify subtypes of cells. Our imputations successfully recovered the expression trends of marker genes in stem cell differentiation and can discover putative pathways regulating biological processes. Oxford University Press 2020-09-28 /pmc/articles/PMC7526005/ /pubmed/33029585 http://dx.doi.org/10.1093/nargab/lqaa077 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Article Hu, Zhirui Zu, Songpeng Liu, Jun S SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation |
title | SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation |
title_full | SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation |
title_fullStr | SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation |
title_full_unstemmed | SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation |
title_short | SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation |
title_sort | simples: a single-cell rna sequencing imputation strategy preserving gene modules and cell clusters variation |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7526005/ https://www.ncbi.nlm.nih.gov/pubmed/33029585 http://dx.doi.org/10.1093/nargab/lqaa077 |
work_keys_str_mv | AT huzhirui simplesasinglecellrnasequencingimputationstrategypreservinggenemodulesandcellclustersvariation AT zusongpeng simplesasinglecellrnasequencingimputationstrategypreservinggenemodulesandcellclustersvariation AT liujuns simplesasinglecellrnasequencingimputationstrategypreservinggenemodulesandcellclustersvariation |