Cargando…

GMM-Based Expanded Feature Space as a Way to Extract Useful Information for Rare Cell Subtypes Identification in Single-Cell Mass Cytometry

Cell subtype identification from mass cytometry data presents a persisting challenge, particularly when dealing with millions of cells. Current solutions are consistently under development, however, their accuracy and sensitivity remain limited, particularly in rare cell-type detection due to freque...

Descripción completa

Detalles Bibliográficos
Autores principales: Suwalska, Aleksandra, Polanska, Joanna
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10531342/
https://www.ncbi.nlm.nih.gov/pubmed/37762336
http://dx.doi.org/10.3390/ijms241814033
_version_ 1785111695941697536
author Suwalska, Aleksandra
Polanska, Joanna
author_facet Suwalska, Aleksandra
Polanska, Joanna
author_sort Suwalska, Aleksandra
collection PubMed
description Cell subtype identification from mass cytometry data presents a persisting challenge, particularly when dealing with millions of cells. Current solutions are consistently under development, however, their accuracy and sensitivity remain limited, particularly in rare cell-type detection due to frequent downsampling. Additionally, they often lack the capability to analyze large data sets. To overcome these limitations, a new method was suggested to define an extended feature space. When combined with the robust clustering algorithm for big data, it results in more efficient cell clustering. Each marker’s intensity distribution is presented as a mixture of normal distributions (Gaussian Mixture Model, GMM), and the expanded space is created by spanning over all obtained GMM components. The projection of the initial flow cytometry marker domain into the expanded space employs GMM-based membership functions. An evaluation conducted on three established cellular identification algorithms (FlowSOM, ClusterX, and PARC) utilizing the most substantial publicly available annotated dataset by Samusik et al. demonstrated the superior performance of the suggested approach in comparison to the standard. Although our approach identified 20 cell clusters instead of the expected 24, their intra-cluster homogeneity and inter-cluster differences were superior to the 24-cluster FlowSOM-based solution.
format Online
Article
Text
id pubmed-10531342
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-105313422023-09-28 GMM-Based Expanded Feature Space as a Way to Extract Useful Information for Rare Cell Subtypes Identification in Single-Cell Mass Cytometry Suwalska, Aleksandra Polanska, Joanna Int J Mol Sci Article Cell subtype identification from mass cytometry data presents a persisting challenge, particularly when dealing with millions of cells. Current solutions are consistently under development, however, their accuracy and sensitivity remain limited, particularly in rare cell-type detection due to frequent downsampling. Additionally, they often lack the capability to analyze large data sets. To overcome these limitations, a new method was suggested to define an extended feature space. When combined with the robust clustering algorithm for big data, it results in more efficient cell clustering. Each marker’s intensity distribution is presented as a mixture of normal distributions (Gaussian Mixture Model, GMM), and the expanded space is created by spanning over all obtained GMM components. The projection of the initial flow cytometry marker domain into the expanded space employs GMM-based membership functions. An evaluation conducted on three established cellular identification algorithms (FlowSOM, ClusterX, and PARC) utilizing the most substantial publicly available annotated dataset by Samusik et al. demonstrated the superior performance of the suggested approach in comparison to the standard. Although our approach identified 20 cell clusters instead of the expected 24, their intra-cluster homogeneity and inter-cluster differences were superior to the 24-cluster FlowSOM-based solution. MDPI 2023-09-13 /pmc/articles/PMC10531342/ /pubmed/37762336 http://dx.doi.org/10.3390/ijms241814033 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Suwalska, Aleksandra
Polanska, Joanna
GMM-Based Expanded Feature Space as a Way to Extract Useful Information for Rare Cell Subtypes Identification in Single-Cell Mass Cytometry
title GMM-Based Expanded Feature Space as a Way to Extract Useful Information for Rare Cell Subtypes Identification in Single-Cell Mass Cytometry
title_full GMM-Based Expanded Feature Space as a Way to Extract Useful Information for Rare Cell Subtypes Identification in Single-Cell Mass Cytometry
title_fullStr GMM-Based Expanded Feature Space as a Way to Extract Useful Information for Rare Cell Subtypes Identification in Single-Cell Mass Cytometry
title_full_unstemmed GMM-Based Expanded Feature Space as a Way to Extract Useful Information for Rare Cell Subtypes Identification in Single-Cell Mass Cytometry
title_short GMM-Based Expanded Feature Space as a Way to Extract Useful Information for Rare Cell Subtypes Identification in Single-Cell Mass Cytometry
title_sort gmm-based expanded feature space as a way to extract useful information for rare cell subtypes identification in single-cell mass cytometry
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10531342/
https://www.ncbi.nlm.nih.gov/pubmed/37762336
http://dx.doi.org/10.3390/ijms241814033
work_keys_str_mv AT suwalskaaleksandra gmmbasedexpandedfeaturespaceasawaytoextractusefulinformationforrarecellsubtypesidentificationinsinglecellmasscytometry
AT polanskajoanna gmmbasedexpandedfeaturespaceasawaytoextractusefulinformationforrarecellsubtypesidentificationinsinglecellmasscytometry