Challenges in the Multivariate Analysis of Mass Cytometry Data: The Effect of Randomization

Cytometry by time‐of‐flight (CyTOF) has emerged as a high‐throughput single cell technology able to provide large samples of protein readouts. Already, there exists a large pool of advanced high‐dimensional analysis algorithms that explore the observed heterogeneous distributions making intriguing b...

Bibliographic Details
Main Authors: Papoutsoglou, Georgios, Lagani, Vincenzo, Schmidt, Angelika, Tsirlis, Konstantinos, Cabrero, David‐Gómez, Tegnér, Jesper, Tsamardinos, Ioannis
Format: Online Article Text
Language: English
Published: John Wiley & Sons, Inc. 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7027760/
https://www.ncbi.nlm.nih.gov/pubmed/31692248
http://dx.doi.org/10.1002/cyto.a.23908
collection PubMed
description Cytometry by time‐of‐flight (CyTOF) has emerged as a high‐throughput single cell technology able to provide large samples of protein readouts. Already, there exists a large pool of advanced high‐dimensional analysis algorithms that explore the observed heterogeneous distributions making intriguing biological inferences. A fact largely overlooked by these methods, however, is the effect of the established data preprocessing pipeline to the distributions of the measured quantities. In this article, we focus on randomization, a transformation used for improving data visualization, which can negatively affect multivariate data analysis methods such as dimensionality reduction, clustering, and network reconstruction algorithms. Our results indicate that randomization should be used only for visualization purposes, but not in conjunction with high‐dimensional analytical tools. © 2019 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.
format Online
Article
Text
id pubmed-7027760
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher John Wiley & Sons, Inc.
record_format MEDLINE/PubMed
spelling pubmed-7027760 2020-02-24 Cytometry A Original Articles John Wiley & Sons, Inc. 2019-11-06 2019-11 /pmc/articles/PMC7027760/ /pubmed/31692248 http://dx.doi.org/10.1002/cyto.a.23908 Text en © 2019 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title Challenges in the Multivariate Analysis of Mass Cytometry Data: The Effect of Randomization
topic Original Articles