Imputing single-cell RNA-seq data by considering cell heterogeneity and prior expression of dropouts

Single-cell RNA sequencing (scRNA-seq) provides a powerful tool to determine expression patterns of thousands of individual cells. However, the analysis of scRNA-seq data remains a computational challenge due to the high technical noise such as the presence of dropout events that lead to a large pro...

Bibliographic Details
Main authors: Zhang, Lihua; Zhang, Shihua
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8035992/
https://www.ncbi.nlm.nih.gov/pubmed/33002136
http://dx.doi.org/10.1093/jmcb/mjaa052
collection PubMed
description Single-cell RNA sequencing (scRNA-seq) provides a powerful tool to determine expression patterns of thousands of individual cells. However, the analysis of scRNA-seq data remains a computational challenge due to the high technical noise such as the presence of dropout events that lead to a large proportion of zeros for expressed genes. Taking into account the cell heterogeneity and the relationship between dropout rate and expected expression level, we present a cell sub-population based bounded low-rank (PBLR) method to impute the dropouts of scRNA-seq data. Through application to both simulated and real scRNA-seq datasets, PBLR is shown to be effective in recovering dropout events, and it can dramatically improve the low-dimensional representation and the recovery of gene‒gene relationships masked by dropout events compared to several state-of-the-art methods. Moreover, PBLR also detects accurate and robust cell sub-populations automatically, shedding light on its flexibility and generality for scRNA-seq data analysis.
id pubmed-8035992
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8035992 2021-04-14 Zhang, Lihua; Zhang, Shihua. J Mol Cell Biol. Articles. Oxford University Press 2020-10-01 /pmc/articles/PMC8035992/ /pubmed/33002136 http://dx.doi.org/10.1093/jmcb/mjaa052 Text en © The Author(s) (2020). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Articles