Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells

BACKGROUND: To facilitate the investigation of the pathogenic roles played by various immune cells in complex tissues such as tumors, a few computational methods for deconvoluting bulk gene expression profiles to predict cell composition have been created. However, available methods were usually dev...

Bibliographic Details
Main Authors: Chiu, Yen-Jung; Hsieh, Yi-Hsuan; Huang, Yen-Hua
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923925/
https://www.ncbi.nlm.nih.gov/pubmed/31856824
http://dx.doi.org/10.1186/s12920-019-0613-5
collection PubMed
description BACKGROUND: To facilitate the investigation of the pathogenic roles played by various immune cells in complex tissues such as tumors, a few computational methods for deconvoluting bulk gene expression profiles to predict cell composition have been developed. However, available methods were usually developed alongside a set of reference gene expression profiles consisting of imbalanced replicates across different cell types. Therefore, the objective of this study was to create a new deconvolution method equipped with a new set of reference gene expression profiles that incorporate more microarray replicates of the immune cells that have been frequently implicated in the poor prognosis of cancers, such as T helper cells, regulatory T cells and macrophage M1/M2 cells. METHODS: Our deconvolution method was developed using ε-support vector regression (ε-SVR) as the core algorithm, with a loss function subject to an L1-norm penalty. To construct the reference gene expression signature matrix for regression, a subset of differentially expressed genes was chosen from 148 microarray-based gene expression profiles for 9 types of immune cells by using ANOVA and minimizing the condition number. Agreement analyses including mean absolute percentage errors and Bland-Altman plots were carried out to compare the performances of our method and CIBERSORT. RESULTS: In silico cell mixtures, simulated bulk tissues, and real human samples with known immune-cell fractions were used as the test datasets for benchmarking. Our method outperformed CIBERSORT in the benchmarks using in silico breast tissue-immune cell mixtures in the proportions of 30:70 and 50:50, and in the benchmark using 164 human PBMC samples. Our results suggest that the performance of our method was at least comparable to that of a state-of-the-art tool, CIBERSORT.
CONCLUSIONS: We developed a new cell composition deconvolution method and the implementation was entirely based on publicly available R and Python packages. In addition, we compiled a new set of reference gene expression profiles, which might allow for a more robust prediction of the immune cell fractions from the expression profiles of cell mixtures. The source code of our method can be downloaded from https://github.com/holiday01/deconvolution-to-estimate-immune-cell-subsets.
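Note: the agreement analyses named in the abstract (mean absolute percentage error and Bland-Altman analysis) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the 1.96 multiplier for the limits of agreement, the choice to skip zero-valued true fractions, and the example fractions are all assumptions made for illustration only.

```python
from statistics import mean, stdev


def mape(true_fracs, pred_fracs):
    """Mean absolute percentage error, in percent.

    Assumption: cell types with a true fraction of zero are skipped,
    because the percentage error is undefined for them.
    """
    pairs = [(t, p) for t, p in zip(true_fracs, pred_fracs) if t != 0]
    return 100.0 * mean(abs(p - t) / abs(t) for t, p in pairs)


def bland_altman(true_fracs, pred_fracs):
    """Return (bias, lower limit, upper limit) of agreement.

    Bias is the mean of the differences (predicted minus true); the
    limits are bias +/- 1.96 sample standard deviations of the
    differences, the conventional 95% limits of agreement.
    """
    diffs = [p - t for t, p in zip(true_fracs, pred_fracs)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical known and predicted immune-cell fractions (not real data).
true_fracs = [0.10, 0.20, 0.30, 0.40]
pred_fracs = [0.12, 0.18, 0.33, 0.37]

error_percent = mape(true_fracs, pred_fracs)
bias, lower, upper = bland_altman(true_fracs, pred_fracs)
print(f"MAPE: {error_percent:.3f}%")
print(f"Bland-Altman bias: {bias:.4f}, limits: [{lower:.4f}, {upper:.4f}]")
```

A Bland-Altman plot would chart each difference against the mean of the paired values, with horizontal lines at the bias and the two limits; the sketch computes only those three reference values.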
format Online
Article
Text
id pubmed-6923925
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6923925 2019-12-30 BMC Med Genomics Research BioMed Central 2019-12-20 /pmc/articles/PMC6923925/ /pubmed/31856824 http://dx.doi.org/10.1186/s12920-019-0613-5 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Improved cell composition deconvolution method of bulk gene expression profiles to quantify subsets of immune cells
topic Research