ScInfoVAE: interpretable dimensional reduction of single cell transcription data with variational autoencoders and extended mutual information regularization

Bibliographic Details
Main Authors: Pan, Weiquan; Long, Faning; Pan, Jian
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257850/
https://www.ncbi.nlm.nih.gov/pubmed/37301826
http://dx.doi.org/10.1186/s13040-023-00333-1
author Pan, Weiquan
Long, Faning
Pan, Jian
collection PubMed
description Single-cell RNA-sequencing (scRNA-seq) data are a good indicator of cell-to-cell heterogeneity and can aid the study of cell growth by identifying cell types. Recent advances in variational autoencoders (VAEs) have demonstrated their ability to learn robust feature representations for scRNA-seq data. However, VAEs tend to ignore the latent variables when combined with a decoding distribution that is too flexible. In this paper, we introduce ScInfoVAE, a dimensional reduction method based on the mutual information variational autoencoder (InfoVAE), which can more effectively identify various cell types in scRNA-seq data of complex tissues. ScInfoVAE combines an InfoVAE deep model with a zero-inflated negative binomial (ZINB) distribution model and reformulates the objective function to denoise scRNA-seq data and learn an efficient low-dimensional representation of them. We evaluate ScInfoVAE on 15 real scRNA-seq datasets and show that our method provides high clustering performance. In addition, we use simulated data to investigate the interpretability of feature extraction; visualization results show that the low-dimensional representation learned by ScInfoVAE preserves the local and global neighborhood structure of the data well. Our model can also significantly improve the quality of the variational posterior. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-023-00333-1.
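The zero-inflated negative binomial distribution named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the parameterization (mean `mu`, inverse dispersion `theta`, zero-inflation probability `pi`) are assumptions chosen for illustration, and the record itself does not specify how ScInfoVAE parameterizes the distribution.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial distribution
    with mean mu > 0 and inverse dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.
    With probability pi (0 <= pi < 1) the count is an excess zero;
    otherwise it is drawn from the negative binomial."""
    log_nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero can come from the inflation component or from the NB itself.
        return math.log(pi + (1.0 - pi) * math.exp(log_nb))
    return math.log(1.0 - pi) + log_nb


def zinb_nll(counts, mu, theta, pi):
    """Negative log-likelihood of a list of counts sharing one (mu, theta, pi)."""
    return -sum(zinb_log_pmf(x, mu, theta, pi) for x in counts)
```

In a model of the kind the description outlines, a decoder network would typically output per-gene values of `mu`, `theta` and `pi`, and a term like `zinb_nll` would serve as the reconstruction loss; that wiring is likewise an assumption here.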
format Online
Article
Text
id pubmed-10257850
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10257850 2023-06-12
BioData Min
Research
BioMed Central 2023-06-10
/pmc/articles/PMC10257850/ /pubmed/37301826 http://dx.doi.org/10.1186/s13040-023-00333-1
Text en
© The Author(s) 2023. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title ScInfoVAE: interpretable dimensional reduction of single cell transcription data with variational autoencoders and extended mutual information regularization
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257850/
https://www.ncbi.nlm.nih.gov/pubmed/37301826
http://dx.doi.org/10.1186/s13040-023-00333-1