Missing Value Imputation With Low-Rank Matrix Completion in Single-Cell RNA-Seq Data by Considering Cell Heterogeneity

Bibliographic Details
Main Authors: Huang, Meng; Ye, Xiucai; Li, Hongmin; Sakurai, Tetsuya
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9329700/
https://www.ncbi.nlm.nih.gov/pubmed/35910201
http://dx.doi.org/10.3389/fgene.2022.952649
author Huang, Meng
Ye, Xiucai
Li, Hongmin
Sakurai, Tetsuya
collection PubMed
description Single-cell RNA-sequencing (scRNA-seq) technologies enable the measurement of gene expression in individual cells, which is helpful for exploring cancer heterogeneity and precision medicine. However, various sources of technical noise lead to false zero values (missing gene expression values) in scRNA-seq data, termed dropout events. These zero values complicate the analysis of cell patterns, which affects high-precision analysis of intra-tumor heterogeneity. Recovering missing gene expression values remains a major obstacle in scRNA-seq data analysis. In this study, taking cell heterogeneity into consideration, we develop a novel method, called single cell Gauss–Newton Gene expression Imputation (scGNGI), to impute scRNA-seq expression matrices using low-rank matrix completion. Experimental results on simulated and real scRNA-seq datasets show that scGNGI imputes missing values in scRNA-seq gene expression more effectively and improves downstream analysis compared with other state-of-the-art methods. Moreover, we show that the proposed method better preserves gene expression variability among cells. Overall, this study helps explore complex biological systems and precision medicine using scRNA-seq data.
format Online
Article
Text
id pubmed-9329700
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9329700 2022-07-29 Front Genet Genetics Frontiers Media S.A. 2022-07-14 /pmc/articles/PMC9329700/ /pubmed/35910201 http://dx.doi.org/10.3389/fgene.2022.952649 Text en Copyright © 2022 Huang, Ye, Li and Sakurai. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Missing Value Imputation With Low-Rank Matrix Completion in Single-Cell RNA-Seq Data by Considering Cell Heterogeneity
topic Genetics