Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS
Reproducible definition and identification of cell types is essential to enable investigations into their biological function, and understanding their relevance in the context of development, disease and evolution. Current approaches model variability in data as continuous latent factors, followed b...
Main Authors: | Yeganeh Marghi, Rohan Gala, Fahimeh Baftizadeh, Uygar Sümbül |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10592946/ https://www.ncbi.nlm.nih.gov/pubmed/37873271 http://dx.doi.org/10.1101/2023.10.02.560574 |
_version_ | 1785124369023893504 |
---|---|
author | Marghi, Yeganeh Gala, Rohan Baftizadeh, Fahimeh Sümbül, Uygar |
author_facet | Marghi, Yeganeh Gala, Rohan Baftizadeh, Fahimeh Sümbül, Uygar |
author_sort | Marghi, Yeganeh |
collection | PubMed |
description | Reproducible definition and identification of cell types is essential to enable investigations into their biological function, and understanding their relevance in the context of development, disease and evolution. Current approaches model variability in data as continuous latent factors, followed by clustering as a separate step, or immediately apply clustering on the data. Clusters obtained in this manner are considered as putative cell types in atlas-scale efforts such as those for mammalian brains. We show that such approaches can suffer from qualitative mistakes in identifying cell types robustly, particularly when the number of such cell types is in the hundreds or even thousands. Here, we propose an unsupervised method, MMIDAS (Mixture Model Inference with Discrete-coupled AutoencoderS), which combines a generalized mixture model with a multi-armed deep neural network, to jointly infer the discrete type and continuous type-specific variability. We develop this framework in a way that can be applied to analysis of both uni-modal and multi-modal datasets. Using four recent datasets of brain cells spanning different technologies, species, and conditions, we demonstrate that MMIDAS significantly outperforms state-of-the-art models in inferring interpretable discrete and continuous representations of cellular identity, and uncovers novel biological insights. Our unsupervised framework can thus help researchers identify more robust cell types, study cell type-dependent continuous variability, interpret such latent factors in the feature domain, and study multi-modal datasets. |
format | Online Article Text |
id | pubmed-10592946 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-105929462023-10-24 Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS Marghi, Yeganeh Gala, Rohan Baftizadeh, Fahimeh Sümbül, Uygar bioRxiv Article Reproducible definition and identification of cell types is essential to enable investigations into their biological function, and understanding their relevance in the context of development, disease and evolution. Current approaches model variability in data as continuous latent factors, followed by clustering as a separate step, or immediately apply clustering on the data. Clusters obtained in this manner are considered as putative cell types in atlas-scale efforts such as those for mammalian brains. We show that such approaches can suffer from qualitative mistakes in identifying cell types robustly, particularly when the number of such cell types is in the hundreds or even thousands. Here, we propose an unsupervised method, MMIDAS (Mixture Model Inference with Discrete-coupled AutoencoderS), which combines a generalized mixture model with a multi-armed deep neural network, to jointly infer the discrete type and continuous type-specific variability. We develop this framework in a way that can be applied to analysis of both uni-modal and multi-modal datasets. Using four recent datasets of brain cells spanning different technologies, species, and conditions, we demonstrate that MMIDAS significantly outperforms state-of-the-art models in inferring interpretable discrete and continuous representations of cellular identity, and uncovers novel biological insights. Our unsupervised framework can thus help researchers identify more robust cell types, study cell type-dependent continuous variability, interpret such latent factors in the feature domain, and study multi-modal datasets. 
Cold Spring Harbor Laboratory 2023-10-02 /pmc/articles/PMC10592946/ /pubmed/37873271 http://dx.doi.org/10.1101/2023.10.02.560574 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Marghi, Yeganeh Gala, Rohan Baftizadeh, Fahimeh Sümbül, Uygar Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS |
title | Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS |
title_full | Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS |
title_fullStr | Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS |
title_full_unstemmed | Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS |
title_short | Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS |
title_sort | joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with mmidas |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10592946/ https://www.ncbi.nlm.nih.gov/pubmed/37873271 http://dx.doi.org/10.1101/2023.10.02.560574 |
work_keys_str_mv | AT marghiyeganeh jointinferenceofdiscretecelltypesandcontinuoustypespecificvariabilityinsinglecelldatasetswithmmidas AT galarohan jointinferenceofdiscretecelltypesandcontinuoustypespecificvariabilityinsinglecelldatasetswithmmidas AT baftizadehfahimeh jointinferenceofdiscretecelltypesandcontinuoustypespecificvariabilityinsinglecelldatasetswithmmidas AT sumbuluygar jointinferenceofdiscretecelltypesandcontinuoustypespecificvariabilityinsinglecelldatasetswithmmidas |