
resVAE ensemble: Unsupervised identification of gene sets in multi-modal single-cell sequencing data using deep ensembles

Feature identification and manual inspection is currently still an integral part of biological data analysis in single-cell sequencing. Features such as expressed genes and open chromatin status are selectively studied in specific contexts, cell states or experimental conditions. While conventional...


Bibliographic Details
Main authors: Ten, Foo Wei, Yuan, Dongsheng, Jabareen, Nabil, Phua, Yin Jun, Eils, Roland, Lukassen, Sören, Conrad, Christian
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9975353/
https://www.ncbi.nlm.nih.gov/pubmed/36875765
http://dx.doi.org/10.3389/fcell.2023.1091047
author Ten, Foo Wei
Yuan, Dongsheng
Jabareen, Nabil
Phua, Yin Jun
Eils, Roland
Lukassen, Sören
Conrad, Christian
collection PubMed
description Feature identification and manual inspection is currently still an integral part of biological data analysis in single-cell sequencing. Features such as expressed genes and open chromatin status are selectively studied in specific contexts, cell states or experimental conditions. While conventional analysis methods construct a relatively static view on gene candidates, artificial neural networks have been used to model their interactions after hierarchical gene regulatory networks. However, it is challenging to identify consistent features in this modeling process due to the inherently stochastic nature of these methods. Therefore, we propose using ensembles of autoencoders and subsequent rank aggregation to extract consensus features in a less biased manner. Here, we performed sequencing data analyses of different modalities either independently or simultaneously as well as with other analysis tools. Our resVAE ensemble method can successfully complement and find additional unbiased biological insights with minimal data processing or feature selection steps while giving a measurement of confidence, especially for models using stochastic or approximation algorithms. In addition, our method can also work with overlapping clustering identity assignment suitable for transitionary cell types or cell fates in comparison to most conventional tools.
format Online
Article
Text
id pubmed-9975353
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-99753532023-03-02 resVAE ensemble: Unsupervised identification of gene sets in multi-modal single-cell sequencing data using deep ensembles Ten, Foo Wei Yuan, Dongsheng Jabareen, Nabil Phua, Yin Jun Eils, Roland Lukassen, Sören Conrad, Christian Front Cell Dev Biol Cell and Developmental Biology Feature identification and manual inspection is currently still an integral part of biological data analysis in single-cell sequencing. Features such as expressed genes and open chromatin status are selectively studied in specific contexts, cell states or experimental conditions. While conventional analysis methods construct a relatively static view on gene candidates, artificial neural networks have been used to model their interactions after hierarchical gene regulatory networks. However, it is challenging to identify consistent features in this modeling process due to the inherently stochastic nature of these methods. Therefore, we propose using ensembles of autoencoders and subsequent rank aggregation to extract consensus features in a less biased manner. Here, we performed sequencing data analyses of different modalities either independently or simultaneously as well as with other analysis tools. Our resVAE ensemble method can successfully complement and find additional unbiased biological insights with minimal data processing or feature selection steps while giving a measurement of confidence, especially for models using stochastic or approximation algorithms. In addition, our method can also work with overlapping clustering identity assignment suitable for transitionary cell types or cell fates in comparison to most conventional tools. Frontiers Media S.A. 2023-02-15 /pmc/articles/PMC9975353/ /pubmed/36875765 http://dx.doi.org/10.3389/fcell.2023.1091047 Text en Copyright © 2023 Ten, Yuan, Jabareen, Phua, Eils, Lukassen and Conrad. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title resVAE ensemble: Unsupervised identification of gene sets in multi-modal single-cell sequencing data using deep ensembles
topic Cell and Developmental Biology