OpenCyto: An Open Source Infrastructure for Scalable, Robust, Reproducible, and Automated, End-to-End Flow Cytometry Data Analysis

Flow cytometry is used increasingly in clinical research for cancer, immunology and vaccines. Technological advances in cytometry instrumentation are increasing the size and dimensionality of data sets, posing a challenge for traditional data management and analysis. Automated analysis methods, desp...

Bibliographic Details
Main authors: Finak, Greg, Frelinger, Jacob, Jiang, Wenxin, Newell, Evan W., Ramey, John, Davis, Mark M., Kalams, Spyros A., De Rosa, Stephen C., Gottardo, Raphael
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4148203/
https://www.ncbi.nlm.nih.gov/pubmed/25167361
http://dx.doi.org/10.1371/journal.pcbi.1003806
author Finak, Greg
Frelinger, Jacob
Jiang, Wenxin
Newell, Evan W.
Ramey, John
Davis, Mark M.
Kalams, Spyros A.
De Rosa, Stephen C.
Gottardo, Raphael
author_sort Finak, Greg
collection PubMed
description Flow cytometry is used increasingly in clinical research for cancer, immunology and vaccines. Technological advances in cytometry instrumentation are increasing the size and dimensionality of data sets, posing a challenge for traditional data management and analysis. Automated analysis methods, despite a general consensus of their importance to the future of the field, have been slow to gain widespread adoption. Here we present OpenCyto, a new BioConductor infrastructure and data analysis framework designed to lower the barrier of entry to automated flow data analysis algorithms by addressing key areas that we believe have held back wider adoption of automated approaches. OpenCyto supports end-to-end data analysis that is robust and reproducible while generating results that are easy to interpret. We have improved the existing, widely used core BioConductor flow cytometry infrastructure by allowing analysis to scale in a memory efficient manner to the large flow data sets that arise in clinical trials, and integrating domain-specific knowledge as part of the pipeline through the hierarchical relationships among cell populations. Pipelines are defined through a text-based csv file, limiting the need to write data-specific code, and are data agnostic to simplify repetitive analysis for core facilities. We demonstrate how to analyze two large cytometry data sets: an intracellular cytokine staining (ICS) data set from a published HIV vaccine trial focused on detecting rare, antigen-specific T-cell populations, where we identify a new subset of CD8 T-cells with a vaccine-regimen specific response that could not be identified through manual analysis, and a CyTOF T-cell phenotyping data set where a large staining panel and many cell populations are a challenge for traditional analysis. The substantial improvements to the core BioConductor flow cytometry packages give OpenCyto the potential for wide adoption. 
It can rapidly leverage new developments in computational cytometry and facilitate reproducible analysis in a unified environment.
format Online
Article
Text
id pubmed-4148203
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4148203 2014-08-29 PLoS Comput Biol Research Article Public Library of Science 2014-08-28 /pmc/articles/PMC4148203/ /pubmed/25167361 http://dx.doi.org/10.1371/journal.pcbi.1003806 Text en © 2014 Finak et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title OpenCyto: An Open Source Infrastructure for Scalable, Robust, Reproducible, and Automated, End-to-End Flow Cytometry Data Analysis
topic Research Article