Model-based clustering for flow and mass cytometry data with clinical information

BACKGROUND: High-dimensional flow cytometry and mass cytometry allow systemic-level characterization of more than 10 protein profiles at single-cell resolution and provide a much broader landscape in many biological applications, such as disease diagnosis and prediction of clinical outcome. When ass...

Bibliographic Details
Main authors: Abe, Ko; Minoura, Kodai; Maeda, Yuka; Nishikawa, Hiroyoshi; Shimamura, Teppei
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7495858/
https://www.ncbi.nlm.nih.gov/pubmed/32938365
http://dx.doi.org/10.1186/s12859-020-03671-7
author Abe, Ko
Minoura, Kodai
Maeda, Yuka
Nishikawa, Hiroyoshi
Shimamura, Teppei
collection PubMed
description BACKGROUND: High-dimensional flow cytometry and mass cytometry allow systemic-level characterization of more than 10 protein profiles at single-cell resolution and provide a much broader landscape in many biological applications, such as disease diagnosis and prediction of clinical outcome. When associating clinical information with cytometry data, traditional approaches require two distinct steps: identification of cell populations, followed by a statistical test to determine whether the difference between two population proportions is significant. These two-step approaches can lead to information loss and analysis bias. RESULTS: We propose a novel statistical framework, called LAMBDA (Latent Allocation Model with Bayesian Data Analysis), for simultaneous identification of unknown cell populations and discovery of associations between these populations and clinical information. LAMBDA uses probabilistic models tailored to the distinct distributional characteristics of flow and mass cytometry data. For the mass cytometry data, we use a zero-inflated distribution based on the characteristics of the data. A simulation study confirms the usefulness of this model by evaluating the accuracy of the estimated parameters. We also demonstrate that LAMBDA can identify associations between cell populations and their clinical outcomes by analyzing real data. LAMBDA is implemented in R and is available from GitHub (https://github.com/abikoushi/lambda).
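The abstract mentions a zero-inflated distribution for mass cytometry data but does not say which base distribution LAMBDA uses. As a rough illustration of the general idea only, the sketch below mixes a point mass at zero with an exponential distribution; the exponential choice and the function names `zero_inflated_logpdf` and `fit_zero_inflated` are assumptions made for this example and are not part of the LAMBDA package.

```python
import math


def zero_inflated_logpdf(x, pi_zero, rate):
    """Log-density of a zero-inflated exponential (illustrative assumption).

    With probability pi_zero the value is exactly 0; otherwise it is drawn
    from Exponential(rate).
    """
    if not 0.0 < pi_zero < 1.0 or rate <= 0.0:
        raise ValueError("need 0 < pi_zero < 1 and rate > 0")
    if x < 0.0:
        return -math.inf
    if x == 0.0:
        # Exact zeros come from the point-mass component.
        return math.log(pi_zero)
    # Positive values come from the continuous component.
    return math.log1p(-pi_zero) + math.log(rate) - rate * x


def fit_zero_inflated(values):
    """Closed-form maximum-likelihood estimates for the mixture above."""
    positives = [v for v in values if v > 0.0]
    if not values or not positives:
        raise ValueError("need at least one positive observation")
    pi_zero = (len(values) - len(positives)) / len(values)
    rate = len(positives) / sum(positives)
    return pi_zero, rate


if __name__ == "__main__":
    intensities = [0.0, 0.0, 1.0, 3.0]  # made-up marker intensities
    pi_hat, rate_hat = fit_zero_inflated(intensities)
    print(pi_hat, rate_hat)
    print(sum(zero_inflated_logpdf(v, pi_hat, rate_hat) for v in intensities))
```

Because the continuous component places no probability on exactly zero, the zero fraction and the rate can be estimated separately here. A full model in the spirit of the abstract would instead estimate such parameters jointly with the latent cluster assignments.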
format Online
Article
Text
id pubmed-7495858
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7495858 2020-09-23 Model-based clustering for flow and mass cytometry data with clinical information. Abe, Ko; Minoura, Kodai; Maeda, Yuka; Nishikawa, Hiroyoshi; Shimamura, Teppei. BMC Bioinformatics. Research.
BioMed Central 2020-09-17 /pmc/articles/PMC7495858/ /pubmed/32938365 http://dx.doi.org/10.1186/s12859-020-03671-7 Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Model-based clustering for flow and mass cytometry data with clinical information
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7495858/
https://www.ncbi.nlm.nih.gov/pubmed/32938365
http://dx.doi.org/10.1186/s12859-020-03671-7