
An efficient scRNA-seq dropout imputation method using graph attention network

BACKGROUND: Single-cell sequencing technology can process a large number of single-cell libraries at the same time and reveal the heterogeneity among cells. However, analyzing single-cell data is computationally challenging. Because there are low counts in the gene expression region...


Bibliographic Details
Main Authors: Xu, Chenyang, Cai, Lei, Gao, Jingyang
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8650344/
https://www.ncbi.nlm.nih.gov/pubmed/34876032
http://dx.doi.org/10.1186/s12859-021-04493-x
author Xu, Chenyang
Cai, Lei
Gao, Jingyang
collection PubMed
description BACKGROUND: Single-cell sequencing technology can process a large number of single-cell libraries at the same time and reveal the heterogeneity among cells. However, analyzing single-cell data is computationally challenging. Because counts in the gene expression matrix are low, there is a high chance that a non-zero value is recorded as zero; such events are called dropout events. At present, mainstream dropout imputation methods such as DCA, MAGIC, scVI, scImpute and SAVER cannot effectively recover the true expression of cells from dropout noise. RESULTS: In this paper, we propose an autoencoder-structured network named GNNImpute. GNNImpute uses graph attention convolution to aggregate information from multi-level similar cells, implementing convolution operations in non-Euclidean space on scRNA-seq data. Unlike current imputation tools, GNNImpute can accurately and effectively impute dropouts and reduce dropout noise. We use mean square error (MSE), mean absolute error (MAE), Pearson correlation coefficient (PCC) and cosine similarity (CS) to compare the performance of GNNImpute with that of other methods. We analyze four real datasets, and our results show that GNNImpute achieves 3.0130 MSE, 0.6781 MAE, 0.9073 PCC and 0.9134 CS. Furthermore, we use the adjusted Rand index (ARI) and normalized mutual information (NMI) to measure the clustering effect; GNNImpute achieves 0.8199 (ARI) and 0.8368 (NMI). CONCLUSIONS: In this investigation, we propose a single-cell dropout imputation method (GNNImpute) that effectively uses information shared between cells to impute dropouts in scRNA-seq data. We test it on different real datasets and evaluate its effectiveness in terms of MSE, MAE, PCC and CS. The results show that graph attention convolution and the autoencoder structure have great potential in single-cell dropout imputation.
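The four imputation metrics named in the abstract (MSE, MAE, PCC and CS) have standard definitions, which can be sketched as below. This is a minimal illustration and not the authors' evaluation code: the function name `imputation_metrics`, the flattened-list input, and the toy values are assumptions, and the record does not say whether the paper computes the metrics over all entries or only over masked ones.

```python
import math


def imputation_metrics(truth, imputed):
    """Return MSE, MAE, PCC and CS between two equal-length value lists.

    Hypothetical helper: both inputs are assumed to be flattened
    expression values (one number per cell-gene entry).
    """
    n = len(truth)
    if n == 0 or n != len(imputed):
        raise ValueError("inputs must be non-empty and of equal length")

    # Error-based metrics: lower is better.
    diffs = [p - t for t, p in zip(truth, imputed)]
    mse = sum(d * d for d in diffs) / n
    mae = sum(abs(d) for d in diffs) / n

    # Pearson correlation: covariance over the product of standard deviations.
    # Undefined (division by zero) if either input is constant.
    mean_t = sum(truth) / n
    mean_p = sum(imputed) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, imputed))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_p = sum((p - mean_p) ** 2 for p in imputed)
    pcc = cov / math.sqrt(var_t * var_p)

    # Cosine similarity: dot product over the product of Euclidean norms.
    dot = sum(t * p for t, p in zip(truth, imputed))
    norm_t = math.sqrt(sum(t * t for t in truth))
    norm_p = math.sqrt(sum(p * p for p in imputed))
    cs = dot / (norm_t * norm_p)

    return {"MSE": mse, "MAE": mae, "PCC": pcc, "CS": cs}


# Toy values for illustration only; they are not data from the paper.
truth = [1.0, 2.0, 3.0, 4.0]
imputed = [1.0, 2.0, 3.0, 6.0]
scores = imputation_metrics(truth, imputed)
```

Lower MSE and MAE indicate a closer reconstruction, while PCC and CS approach 1 as the imputed values align with the reference values.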
format Online
Article
Text
id pubmed-8650344
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8650344 2021-12-07 An efficient scRNA-seq dropout imputation method using graph attention network Xu, Chenyang Cai, Lei Gao, Jingyang BMC Bioinformatics Methodology Article BioMed Central 2021-12-07 /pmc/articles/PMC8650344/ /pubmed/34876032 http://dx.doi.org/10.1186/s12859-021-04493-x Text en © The Author(s) 2021
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title An efficient scRNA-seq dropout imputation method using graph attention network
topic Methodology Article