
Deblender: a semi−/unsupervised multi-operational computational method for complete deconvolution of expression data from heterogeneous samples

BACKGROUND: Towards discovering robust cancer biomarkers, it is imperative to unravel the cellular heterogeneity of patient samples and comprehend the interactions between cancer cells and the various cell types in the tumor microenvironment. The first generation of ‘partial’ computational deconvolu...


Bibliographic Details
Main Authors: Dimitrakopoulou, Konstantina; Wik, Elisabeth; Akslen, Lars A.; Jonassen, Inge
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6223087/
https://www.ncbi.nlm.nih.gov/pubmed/30404611
http://dx.doi.org/10.1186/s12859-018-2442-5
author Dimitrakopoulou, Konstantina
Wik, Elisabeth
Akslen, Lars A.
Jonassen, Inge
collection PubMed
description BACKGROUND: Towards discovering robust cancer biomarkers, it is imperative to unravel the cellular heterogeneity of patient samples and comprehend the interactions between cancer cells and the various cell types in the tumor microenvironment. The first generation of ‘partial’ computational deconvolution methods required prior information either on the cell/tissue type proportions or the cell/tissue type-specific expression signatures and the number of involved cell/tissue types. The second generation of ‘complete’ approaches allowed estimating both of the cell/tissue type proportions and cell/tissue type-specific expression profiles directly from the mixed gene expression data, based on known (or automatically identified) cell/tissue type-specific marker genes. RESULTS: We present Deblender, a flexible complete deconvolution tool operating in semi−/unsupervised mode based on the user’s access to known marker gene lists and information about cell/tissue composition. In case of no prior knowledge, global gene expression variability is used in clustering the mixed data to substitute marker sets with cluster sets. In addition, we integrate a model selection criterion to predict the number of constituent cell/tissue types. Moreover, we provide a tailored algorithmic scheme to estimate mixture proportions for realistic experimental cases where the number of involved cell/tissue types exceeds the number of mixed samples. We assess the performance of Deblender and a set of state-of-the-art existing tools on a comprehensive set of benchmark and patient cancer mixture expression datasets (including TCGA). CONCLUSION: Our results corroborate that Deblender can be a valuable tool to improve understanding of gene expression datasets with implications for prediction and clinical utilization. Deblender is implemented in MATLAB and is available from (https://github.com/kondim1983/Deblender/). 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2442-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6223087
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6223087 2018-11-19 Deblender: a semi−/unsupervised multi-operational computational method for complete deconvolution of expression data from heterogeneous samples Dimitrakopoulou, Konstantina; Wik, Elisabeth; Akslen, Lars A.; Jonassen, Inge BMC Bioinformatics Research Article BioMed Central 2018-11-07 /pmc/articles/PMC6223087/ /pubmed/30404611 http://dx.doi.org/10.1186/s12859-018-2442-5 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Deblender: a semi−/unsupervised multi-operational computational method for complete deconvolution of expression data from heterogeneous samples
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6223087/
https://www.ncbi.nlm.nih.gov/pubmed/30404611
http://dx.doi.org/10.1186/s12859-018-2442-5