Network-Based Single-Cell RNA-Seq Data Imputation Enhances Cell Type Identification
Single-cell RNA sequencing is a powerful technology for obtaining transcriptomes at single-cell resolution. However, it suffers from dropout events (i.e., excess zero counts) because only a small fraction of the transcripts in each cell are sequenced. This inherent sparsity...
Main authors: | Zand, Maryam; Ruan, Jianhua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7230610/ https://www.ncbi.nlm.nih.gov/pubmed/32244427 http://dx.doi.org/10.3390/genes11040377 |
_version_ | 1783534994794217472 |
---|---|
author | Zand, Maryam Ruan, Jianhua |
author_facet | Zand, Maryam Ruan, Jianhua |
author_sort | Zand, Maryam |
collection | PubMed |
description | Single-cell RNA sequencing (scRNA-seq) is a powerful technology for obtaining transcriptomes at single-cell resolution. However, it suffers from dropout events (i.e., excess zero counts) because only a small fraction of the transcripts in each cell are sequenced. This inherent sparsity of expression profiles hinders cell-level and gene-level characterization, such as cell type identification, and other downstream analyses. To alleviate the dropout problem, we introduce netImpute, a network-based method that leverages the hidden information in gene co-expression networks to recover real signals. netImpute employs Random Walk with Restart (RWR) to adjust the expression level of a gene in a given cell by borrowing information from that gene's neighbors in a gene co-expression network. Performance evaluation and comparison with existing tools on simulated data and seven real datasets show that netImpute substantially enhances clustering accuracy and data visualization clarity, thanks to its effective treatment of dropouts. Although the idea behind netImpute is general and can be applied to other types of networks, such as a cell co-expression network or a protein–protein interaction (PPI) network, the evaluation results show that a gene co-expression network is consistently more beneficial, presumably because a PPI network usually lacks cell type context, while a cell co-expression network can cause information loss for rare cell types. Evaluation results on several biological datasets show that netImpute recovers missing transcripts in scRNA-seq data more effectively than existing methods and improves the identification and visualization of heterogeneous cell types. |
format | Online Article Text |
id | pubmed-7230610 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7230610 2020-05-22 Network-Based Single-Cell RNA-Seq Data Imputation Enhances Cell Type Identification Zand, Maryam Ruan, Jianhua Genes (Basel) Article MDPI 2020-03-31 /pmc/articles/PMC7230610/ /pubmed/32244427 http://dx.doi.org/10.3390/genes11040377 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Network-Based Single-Cell RNA-Seq Data Imputation Enhances Cell Type Identification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7230610/ https://www.ncbi.nlm.nih.gov/pubmed/32244427 http://dx.doi.org/10.3390/genes11040377 |
work_keys_str_mv | AT zandmaryam networkbasedsinglecellrnaseqdataimputationenhancescelltypeidentification AT ruanjianhua networkbasedsinglecellrnaseqdataimputationenhancescelltypeidentification |
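
The description in this record says that netImpute uses Random Walk with Restart (RWR) over a gene co-expression network to adjust each gene's expression in a cell. The following is a minimal sketch of that general idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function name `rwr_impute`, the use of positive Pearson correlation as the co-expression weight, the column normalization, the restart probability `alpha`, and the toy matrix `X_demo`.

```python
import numpy as np


def rwr_impute(X, alpha=0.5, n_iter=200, tol=1e-8):
    """Smooth a genes x cells matrix with Random Walk with Restart.

    Hypothetical sketch: the network construction and parameters below
    are illustrative assumptions, not taken from the netImpute paper.
    """
    X = np.asarray(X, dtype=float)

    # Assumed co-expression network: positive Pearson correlation between genes.
    with np.errstate(invalid="ignore", divide="ignore"):
        C = np.corrcoef(X)          # rows of X (genes) are the variables
    C = np.nan_to_num(C)            # constant genes yield NaN; treat as no edge
    C = np.clip(C, 0.0, None)       # drop negative correlations
    np.fill_diagonal(C, 0.0)        # no self-edges between distinct genes

    # Genes with no neighbors get a self-loop so their values are kept as-is.
    isolated = np.flatnonzero(C.sum(axis=0) == 0)
    C[isolated, isolated] = 1.0

    # Column-normalize so each gene spreads a total weight of 1 to its neighbors.
    W = C / C.sum(axis=0, keepdims=True)

    # Iterate F <- (1 - alpha) * W F + alpha * X until the update is negligible.
    F = X.copy()
    for _ in range(n_iter):
        F_next = (1.0 - alpha) * (W @ F) + alpha * X
        converged = np.abs(F_next - F).max() < tol
        F = F_next
        if converged:
            break
    return F


# Toy data: gene 0 has a zero in cell 1 although its co-expressed
# partner, gene 1, is clearly expressed there. Gene 2 is unrelated.
X_demo = np.array([
    [4.0, 0.0, 6.0, 0.0],
    [4.0, 5.0, 6.0, 0.0],
    [0.0, 0.0, 0.0, 7.0],
])
F_demo = rwr_impute(X_demo)
```

In this toy example the zero for gene 0 in cell 1 is replaced by a positive value borrowed from gene 1, while the unrelated gene 2 is left unchanged. Because the transition matrix is column-normalized, the total expression per cell is preserved; other normalizations or network definitions would behave differently.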