
Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS

Reproducible definition and identification of cell types is essential to enable investigations into their biological function, and understanding their relevance in the context of development, disease and evolution. Current approaches model variability in data as continuous latent factors, followed b...


Bibliographic Details
Main Authors: Marghi, Yeganeh; Gala, Rohan; Baftizadeh, Fahimeh; Sümbül, Uygar
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10592946/
https://www.ncbi.nlm.nih.gov/pubmed/37873271
http://dx.doi.org/10.1101/2023.10.02.560574
author Marghi, Yeganeh
Gala, Rohan
Baftizadeh, Fahimeh
Sümbül, Uygar
collection PubMed
description Reproducible definition and identification of cell types is essential to enable investigations into their biological function, and understanding their relevance in the context of development, disease and evolution. Current approaches model variability in data as continuous latent factors, followed by clustering as a separate step, or immediately apply clustering on the data. Clusters obtained in this manner are considered as putative cell types in atlas-scale efforts such as those for mammalian brains. We show that such approaches can suffer from qualitative mistakes in identifying cell types robustly, particularly when the number of such cell types is in the hundreds or even thousands. Here, we propose an unsupervised method, MMIDAS (Mixture Model Inference with Discrete-coupled AutoencoderS), which combines a generalized mixture model with a multi-armed deep neural network, to jointly infer the discrete type and continuous type-specific variability. We develop this framework in a way that can be applied to analysis of both uni-modal and multi-modal datasets. Using four recent datasets of brain cells spanning different technologies, species, and conditions, we demonstrate that MMIDAS significantly outperforms state-of-the-art models in inferring interpretable discrete and continuous representations of cellular identity, and uncovers novel biological insights. Our unsupervised framework can thus help researchers identify more robust cell types, study cell type-dependent continuous variability, interpret such latent factors in the feature domain, and study multi-modal datasets.
format Online
Article
Text
id pubmed-10592946
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10592946 2023-10-24 Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS Marghi, Yeganeh Gala, Rohan Baftizadeh, Fahimeh Sümbül, Uygar bioRxiv Article
Cold Spring Harbor Laboratory 2023-10-02 /pmc/articles/PMC10592946/ /pubmed/37873271 http://dx.doi.org/10.1101/2023.10.02.560574 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10592946/
https://www.ncbi.nlm.nih.gov/pubmed/37873271
http://dx.doi.org/10.1101/2023.10.02.560574