
PhenoComb: a discovery tool to assess complex phenotypes in high-dimensional single-cell datasets

Bibliographic Details
Main authors: Burke, Paulo E P, Strange, Ann, Monk, Emily, Thompson, Brian, Amato, Carol M, Woods, David M
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710698/
https://www.ncbi.nlm.nih.gov/pubmed/36699375
http://dx.doi.org/10.1093/bioadv/vbac052
author Burke, Paulo E P
Strange, Ann
Monk, Emily
Thompson, Brian
Amato, Carol M
Woods, David M
collection PubMed
description MOTIVATION: High-dimensional cytometry assays can simultaneously measure dozens of markers, enabling the investigation of complex phenotypes. However, because manual gating relies on prior biological knowledge, often only a few marker combinations are assessed. As a result, complex phenotypes with potential biological relevance are overlooked. Here, we present PhenoComb, an R package that allows agnostic exploration of phenotypes by assessing all combinations of markers. PhenoComb uses signal intensity thresholds to assign markers to discrete states (e.g. negative, low, high) and then counts the number of cells per sample for every possible marker combination in a memory-safe manner. Time and disk space are the only constraints on the number of markers evaluated. PhenoComb also provides several approaches to perform statistical comparisons, evaluate the relevance of phenotypes and assess the independence of identified phenotypes. Users can guide the analysis by adjusting several function arguments, such as identifying parent populations of interest, filtering low-frequency populations and defining a maximum complexity of phenotypes to evaluate. PhenoComb is designed to run on either a local computer or a server.
RESULTS: In tests of PhenoComb’s performance on synthetic datasets, computation on 16 markers completed on the scale of minutes, and computation on up to 26 markers completed on the scale of hours. We applied PhenoComb to two publicly available datasets: an HIV flow cytometry dataset (12 markers and 421 samples) and the COVIDome CyTOF dataset (40 markers and 99 samples). In the HIV dataset, PhenoComb identified immune phenotypes associated with HIV seroconversion, including those highlighted in the original publication. In the COVID dataset, we identified several immune phenotypes with altered frequencies in infected individuals relative to healthy individuals. Collectively, PhenoComb represents a powerful discovery tool for agnostically assessing high-dimensional single-cell data.
AVAILABILITY AND IMPLEMENTATION: The PhenoComb R package can be downloaded from https://github.com/SciOmicsLab/PhenoComb.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
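The thresholding and counting step described in the abstract can be illustrated with a minimal Python sketch. This is not the PhenoComb R API: the function names (`discretize`, `count_phenotypes`, `search_space`), the marker names and the cutoff values are all hypothetical, and the sketch handles a single sample in memory rather than reproducing the package's memory-safe, per-sample processing.

```python
from collections import Counter
from itertools import combinations


def discretize(cell, thresholds):
    """Map each marker intensity to a state index (0 = below every cutoff)."""
    return {m: sum(cell[m] >= t for t in cuts) for m, cuts in thresholds.items()}


def count_phenotypes(cells, thresholds, max_complexity=None):
    """Count the cells matching every observed (marker, state) combination.

    A phenotype is a tuple of (marker, state) pairs over a subset of markers;
    max_complexity (hypothetical argument) caps the subset size.
    """
    markers = sorted(thresholds)
    top = len(markers) if max_complexity is None else max_complexity
    counts = Counter()
    for cell in cells:
        states = discretize(cell, thresholds)
        for k in range(1, top + 1):
            for subset in combinations(markers, k):
                counts[tuple((m, states[m]) for m in subset)] += 1
    return counts


def search_space(n_markers, n_states):
    """Possible phenotypes when each marker is in one of n_states or ignored."""
    return (n_states + 1) ** n_markers - 1


# Hypothetical intensities and cutoffs, for illustration only.
cells = [
    {"CD3": 5.0, "CD4": 0.2, "CD8": 3.1},
    {"CD3": 4.0, "CD4": 2.5, "CD8": 0.1},
    {"CD3": 0.3, "CD4": 0.1, "CD8": 0.2},
]
thresholds = {"CD3": [1.0], "CD4": [1.0], "CD8": [1.0]}
counts = count_phenotypes(cells, thresholds)
print(counts[(("CD3", 1), ("CD8", 1))])  # cells that are CD3+ and CD8+
```

Assuming two states per marker, `search_space(16, 2)` evaluates to 43,046,720 possible phenotypes, which gives a sense of why the number of markers is bounded by time and disk space, and why a cap on phenotype complexity is useful.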
format Online
Article
Text
id pubmed-9710698
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9710698 2023-01-24
PhenoComb: a discovery tool to assess complex phenotypes in high-dimensional single-cell datasets
Burke, Paulo E P; Strange, Ann; Monk, Emily; Thompson, Brian; Amato, Carol M; Woods, David M
Bioinform Adv, Original Paper
Oxford University Press 2022-08-03
/pmc/articles/PMC9710698/ /pubmed/36699375 http://dx.doi.org/10.1093/bioadv/vbac052
Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title PhenoComb: a discovery tool to assess complex phenotypes in high-dimensional single-cell datasets
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710698/
https://www.ncbi.nlm.nih.gov/pubmed/36699375
http://dx.doi.org/10.1093/bioadv/vbac052