Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells
BACKGROUND: To facilitate the investigation of the pathogenic roles played by various immune cells in complex tissues such as tumors, a few computational methods for deconvoluting bulk gene expression profiles to predict cell composition have been created. However, available methods were usually dev...
Main authors: | Chiu, Yen-Jung; Hsieh, Yi-Hsuan; Huang, Yen-Hua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923925/ https://www.ncbi.nlm.nih.gov/pubmed/31856824 http://dx.doi.org/10.1186/s12920-019-0613-5 |
_version_ | 1783481625052446720 |
---|---|
author | Chiu, Yen-Jung Hsieh, Yi-Hsuan Huang, Yen-Hua |
author_facet | Chiu, Yen-Jung Hsieh, Yi-Hsuan Huang, Yen-Hua |
author_sort | Chiu, Yen-Jung |
collection | PubMed |
description | BACKGROUND: To facilitate the investigation of the pathogenic roles played by various immune cells in complex tissues such as tumors, a few computational methods for deconvoluting bulk gene expression profiles to predict cell composition have been created. However, available methods were usually developed along with a set of reference gene expression profiles consisting of imbalanced replicates across different cell types. Therefore, the objective of this study was to create a new deconvolution method equipped with a new set of reference gene expression profiles that incorporate more microarray replicates of the immune cells that have been frequently implicated in the poor prognosis of cancers, such as T helper cells, regulatory T cells and macrophage M1/M2 cells. METHODS: Our deconvolution method was developed by choosing ε-support vector regression (ε-SVR) as the core algorithm assigned with a loss function subject to the L1-norm penalty. To construct the reference gene expression signature matrix for regression, a subset of differentially expressed genes were chosen from 148 microarray-based gene expression profiles for 9 types of immune cells by using ANOVA and minimizing condition number. Agreement analyses including mean absolute percentage errors and Bland-Altman plots were carried out to compare the performances of our method and CIBERSORT. RESULTS: In silico cell mixtures, simulated bulk tissues, and real human samples with known immune-cell fractions were used as the test datasets for benchmarking. Our method outperformed CIBERSORT in the benchmarks using in silico breast tissue-immune cell mixtures in the proportions of 30:70 and 50:50, and in the benchmark using 164 human PBMC samples. Our results suggest that the performance of our method was at least comparable to that of a state-of-the-art tool, CIBERSORT. CONCLUSIONS: We developed a new cell composition deconvolution method and the implementation was entirely based on the publicly available R and Python packages. In addition, we compiled a new set of reference gene expression profiles, which might allow for a more robust prediction of the immune cell fractions from the expression profiles of cell mixtures. The source code of our method could be downloaded from https://github.com/holiday01/deconvolution-to-estimate-immune-cell-subsets. |
format | Online Article Text |
id | pubmed-6923925 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6923925 2019-12-30 Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells Chiu, Yen-Jung Hsieh, Yi-Hsuan Huang, Yen-Hua BMC Med Genomics Research BACKGROUND: To facilitate the investigation of the pathogenic roles played by various immune cells in complex tissues such as tumors, a few computational methods for deconvoluting bulk gene expression profiles to predict cell composition have been created. However, available methods were usually developed along with a set of reference gene expression profiles consisting of imbalanced replicates across different cell types. Therefore, the objective of this study was to create a new deconvolution method equipped with a new set of reference gene expression profiles that incorporate more microarray replicates of the immune cells that have been frequently implicated in the poor prognosis of cancers, such as T helper cells, regulatory T cells and macrophage M1/M2 cells. METHODS: Our deconvolution method was developed by choosing ε-support vector regression (ε-SVR) as the core algorithm assigned with a loss function subject to the L1-norm penalty. To construct the reference gene expression signature matrix for regression, a subset of differentially expressed genes were chosen from 148 microarray-based gene expression profiles for 9 types of immune cells by using ANOVA and minimizing condition number. Agreement analyses including mean absolute percentage errors and Bland-Altman plots were carried out to compare the performances of our method and CIBERSORT. RESULTS: In silico cell mixtures, simulated bulk tissues, and real human samples with known immune-cell fractions were used as the test datasets for benchmarking. Our method outperformed CIBERSORT in the benchmarks using in silico breast tissue-immune cell mixtures in the proportions of 30:70 and 50:50, and in the benchmark using 164 human PBMC samples. Our results suggest that the performance of our method was at least comparable to that of a state-of-the-art tool, CIBERSORT. CONCLUSIONS: We developed a new cell composition deconvolution method and the implementation was entirely based on the publicly available R and Python packages. In addition, we compiled a new set of reference gene expression profiles, which might allow for a more robust prediction of the immune cell fractions from the expression profiles of cell mixtures. The source code of our method could be downloaded from https://github.com/holiday01/deconvolution-to-estimate-immune-cell-subsets. BioMed Central 2019-12-20 /pmc/articles/PMC6923925/ /pubmed/31856824 http://dx.doi.org/10.1186/s12920-019-0613-5 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Chiu, Yen-Jung Hsieh, Yi-Hsuan Huang, Yen-Hua Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells |
title | Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells |
title_full | Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells |
title_fullStr | Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells |
title_full_unstemmed | Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells |
title_short | Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells |
title_sort | improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923925/ https://www.ncbi.nlm.nih.gov/pubmed/31856824 http://dx.doi.org/10.1186/s12920-019-0613-5 |
work_keys_str_mv | AT chiuyenjung improvedcellcompositiondeconvolutionmethodofbulkgeneexpressionprofilestoquantifysubsetsofimmunecells AT hsiehyihsuan improvedcellcompositiondeconvolutionmethodofbulkgeneexpressionprofilestoquantifysubsetsofimmunecells AT huangyenhua improvedcellcompositiondeconvolutionmethodofbulkgeneexpressionprofilestoquantifysubsetsofimmunecells |
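
The description field names two agreement analyses, mean absolute percentage error and Bland-Altman plots, for comparing estimated and known immune-cell fractions. The following is a minimal sketch of how those two quantities might be computed. It is not the authors' code: the function names `mape` and `bland_altman` are hypothetical, the example fractions are invented for illustration only, the choice to skip zero-valued true fractions is an assumption, and the 1.96 multiplier is the conventional 95% limits-of-agreement choice rather than something stated in the record.

```python
from statistics import mean, stdev


def mape(true_fracs, est_fracs):
    """Mean absolute percentage error, in percent.

    Pairs whose true fraction is zero are skipped (an assumption made here
    to avoid division by zero; the record does not say how they are handled).
    """
    pairs = [(t, e) for t, e in zip(true_fracs, est_fracs) if t != 0]
    if not pairs:
        raise ValueError("no non-zero true fractions to compare against")
    return 100.0 * mean(abs(e - t) / abs(t) for t, e in pairs)


def bland_altman(true_fracs, est_fracs):
    """Return (bias, lower, upper) for a Bland-Altman analysis.

    bias is the mean of the differences (estimate minus truth); lower and
    upper are the conventional 95% limits of agreement, bias -/+ 1.96 * SD.
    """
    diffs = [e - t for t, e in zip(true_fracs, est_fracs)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Invented fractions for four hypothetical cell types in one sample.
    known = [0.10, 0.20, 0.30, 0.40]
    estimated = [0.12, 0.18, 0.33, 0.37]
    print("MAPE (%):", mape(known, estimated))
    print("Bland-Altman (bias, lower, upper):", bland_altman(known, estimated))
```

A lower MAPE and a bias near zero with narrow limits of agreement would indicate closer agreement between a deconvolution method's estimates and the known fractions.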