CytoGLMM: conditional differential analysis for flow and mass cytometry experiments

BACKGROUND: Flow and mass cytometry are important modern immunology tools for measuring expression levels of multiple proteins on single cells. The goal is to better understand the mechanisms of responses on a single cell basis by studying differential expression of proteins. Most current data analy...

Bibliographic Details
Main Authors: Seiler, Christof; Ferreira, Anne-Maud; Kronstad, Lisa M.; Simpson, Laura J.; Le Gars, Mathieu; Vendrame, Elena; Blish, Catherine A.; Holmes, Susan
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7983283/
https://www.ncbi.nlm.nih.gov/pubmed/33752595
http://dx.doi.org/10.1186/s12859-021-04067-x
_version_ 1783667877284413440
author Seiler, Christof
Ferreira, Anne-Maud
Kronstad, Lisa M.
Simpson, Laura J.
Le Gars, Mathieu
Vendrame, Elena
Blish, Catherine A.
Holmes, Susan
author_sort Seiler, Christof
collection PubMed
description BACKGROUND: Flow and mass cytometry are important modern immunology tools for measuring expression levels of multiple proteins on single cells. The goal is to better understand the mechanisms of responses on a single cell basis by studying differential expression of proteins. Most current data analysis tools compare expressions across many computationally discovered cell types. Our goal is to focus on just one cell type. Our narrower field of application allows us to define a more specific statistical model with easier to control statistical guarantees. RESULTS: Differential analysis of marker expressions can be difficult due to marker correlations and inter-subject heterogeneity, particularly for studies of human immunology. We address these challenges with two multiple regression strategies: a bootstrapped generalized linear model and a generalized linear mixed model. On simulated datasets, we compare the robustness towards marker correlations and heterogeneity of both strategies. For paired experiments, we find that both strategies maintain the target false discovery rate under medium correlations and that mixed models are statistically more powerful under the correct model specification. For unpaired experiments, our results indicate that much larger patient sample sizes are required to detect differences. We illustrate the CytoGLMM R package and workflow for both strategies on a pregnancy dataset. CONCLUSION: Our approach to finding differential proteins in flow and mass cytometry data reduces biases arising from marker correlations and safeguards against false discoveries induced by patient heterogeneity.
format Online
Article
Text
id pubmed-7983283
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7983283 2021-03-22 CytoGLMM: conditional differential analysis for flow and mass cytometry experiments Seiler, Christof Ferreira, Anne-Maud Kronstad, Lisa M. Simpson, Laura J. Le Gars, Mathieu Vendrame, Elena Blish, Catherine A. Holmes, Susan BMC Bioinformatics Research Article
BioMed Central 2021-03-22 /pmc/articles/PMC7983283/ /pubmed/33752595 http://dx.doi.org/10.1186/s12859-021-04067-x Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title CytoGLMM: conditional differential analysis for flow and mass cytometry experiments
topic Research Article