Compositional Data Analysis using Kernels in mass cytometry data
MOTIVATION: Cell-type abundance data arising from mass cytometry experiments are compositional in nature. Classical association tests do not apply to the compositional data due to their non-Euclidean nature. Existing methods for analysis of cell type abundance data suffer from several limitations fo...
Main authors: | Rudra, Pratyaydipta; Baxter, Ryan; Hsieh, Elena W Y; Ghosh, Debashis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8867823/ https://www.ncbi.nlm.nih.gov/pubmed/35224501 http://dx.doi.org/10.1093/bioadv/vbac003 |
_version_ | 1784656133770706944 |
---|---|
author | Rudra, Pratyaydipta Baxter, Ryan Hsieh, Elena W Y Ghosh, Debashis |
author_sort | Rudra, Pratyaydipta |
collection | PubMed |
description | MOTIVATION: Cell-type abundance data arising from mass cytometry experiments are compositional in nature. Classical association tests do not apply to the compositional data due to their non-Euclidean nature. Existing methods for analysis of cell type abundance data suffer from several limitations for high-dimensional mass cytometry data, especially when the sample size is small. RESULTS: We proposed a new multivariate statistical learning methodology, Compositional Data Analysis using Kernels (CODAK), based on the kernel distance covariance (KDC) framework to test the association of the cell type compositions with important predictors (categorical or continuous) such as disease status. CODAK scales well for high-dimensional data and provides satisfactory performance for small sample sizes (n < 25). We conducted simulation studies to compare the performance of the method with existing methods of analyzing cell type abundance data from mass cytometry studies. The method is also applied to a high-dimensional dataset containing different subgroups of populations including Systemic Lupus Erythematosus (SLE) patients and healthy control subjects. AVAILABILITY AND IMPLEMENTATION: CODAK is implemented using R. The codes and the data used in this manuscript are available on the web at http://github.com/GhoshLab/CODAK/. CONTACT: prudra@okstate.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. |
format | Online Article Text |
id | pubmed-8867823 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-88678232022-02-25 Compositional Data Analysis using Kernels in mass cytometry data Rudra, Pratyaydipta Baxter, Ryan Hsieh, Elena W Y Ghosh, Debashis Bioinform Adv Original Article MOTIVATION: Cell-type abundance data arising from mass cytometry experiments are compositional in nature. Classical association tests do not apply to the compositional data due to their non-Euclidean nature. Existing methods for analysis of cell type abundance data suffer from several limitations for high-dimensional mass cytometry data, especially when the sample size is small. RESULTS: We proposed a new multivariate statistical learning methodology, Compositional Data Analysis using Kernels (CODAK), based on the kernel distance covariance (KDC) framework to test the association of the cell type compositions with important predictors (categorical or continuous) such as disease status. CODAK scales well for high-dimensional data and provides satisfactory performance for small sample sizes (n < 25). We conducted simulation studies to compare the performance of the method with existing methods of analyzing cell type abundance data from mass cytometry studies. The method is also applied to a high-dimensional dataset containing different subgroups of populations including Systemic Lupus Erythematosus (SLE) patients and healthy control subjects. AVAILABILITY AND IMPLEMENTATION: CODAK is implemented using R. The codes and the data used in this manuscript are available on the web at http://github.com/GhoshLab/CODAK/. CONTACT: prudra@okstate.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2022-02-11 /pmc/articles/PMC8867823/ /pubmed/35224501 http://dx.doi.org/10.1093/bioadv/vbac003 Text en © The Author(s) 2022. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Rudra, Pratyaydipta Baxter, Ryan Hsieh, Elena W Y Ghosh, Debashis Compositional Data Analysis using Kernels in mass cytometry data |
title | Compositional Data Analysis using Kernels in mass cytometry data |
title_sort | compositional data analysis using kernels in mass cytometry data |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8867823/ https://www.ncbi.nlm.nih.gov/pubmed/35224501 http://dx.doi.org/10.1093/bioadv/vbac003 |
work_keys_str_mv | AT rudrapratyaydipta compositionaldataanalysisusingkernelsinmasscytometrydata AT baxterryan compositionaldataanalysisusingkernelsinmasscytometrydata AT hsiehelenawy compositionaldataanalysisusingkernelsinmasscytometrydata AT ghoshdebashis compositionaldataanalysisusingkernelsinmasscytometrydata |
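
The description field in this record outlines a distance-covariance style test of association between cell-type compositions and a predictor. As a rough illustration of that general idea only, the sketch below runs a permutation test based on the sample distance covariance, using Euclidean distance between centered log-ratio (CLR) transformed compositions. This is not the CODAK implementation (which is in R and uses the kernel distance covariance framework); the function names, the pseudocount, and the choice of CLR plus Euclidean distance are all assumptions made for this example.

```python
import math
import random


def clr(composition, pseudocount=1e-6):
    """Centered log-ratio transform; the pseudocount (an assumption) guards against zeros."""
    logs = [math.log(p + pseudocount) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def distance_matrix(rows):
    """Pairwise Euclidean distances between the row vectors."""
    n = len(rows)
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))) for j in range(n)]
        for i in range(n)
    ]


def double_center(d):
    """Subtract row and column means and add back the grand mean (d is symmetric)."""
    n = len(d)
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def dcov_sq(a, b, order):
    """Squared sample distance covariance, with b's samples relabelled by `order`."""
    n = len(a)
    total = sum(a[i][j] * b[order[i]][order[j]] for i in range(n) for j in range(n))
    return total / n ** 2


def dcov_permutation_test(compositions, predictor, n_perm=999, seed=0):
    """Return (observed statistic, permutation p-value) for a numeric predictor."""
    rng = random.Random(seed)
    n = len(predictor)
    a = double_center(distance_matrix([clr(c) for c in compositions]))
    b = double_center(distance_matrix([[float(x)] for x in predictor]))
    identity = list(range(n))
    observed = dcov_sq(a, b, identity)
    exceed = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # break any association by shuffling predictor labels
        if dcov_sq(a, b, perm) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data (invented for illustration): two groups with clearly different compositions.
    group0 = [[0.70 - 0.01 * i, 0.20 + 0.01 * i, 0.10] for i in range(6)]
    group1 = [[0.20, 0.30 - 0.01 * i, 0.50 + 0.01 * i] for i in range(6)]
    stat, p_value = dcov_permutation_test(group0 + group1, [0] * 6 + [1] * 6)
    print(f"dCov^2 = {stat:.4f}, permutation p-value = {p_value:.4f}")
```

A categorical predictor with more than two levels would need a different distance on the predictor side (for example, 0/1 mismatch); the sketch handles only numeric or binary-coded predictors.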