
Biomarker discovery in heterogeneous tissue samples -taking the in-silico deconfounding approach

BACKGROUND: For heterogeneous tissues, such as blood, measurements of gene expression are confounded by relative proportions of cell types involved. Conclusions have to rely on estimation of gene expression signals for homogeneous cell populations, e.g. by applying micro-dissection, fluorescence act...


Bibliographic Details
Main authors: Repsilber, Dirk, Kern, Sabine, Telaar, Anna, Walzl, Gerhard, Black, Gillian F, Selbig, Joachim, Parida, Shreemanta K, Kaufmann, Stefan HE, Jacobsen, Marc
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098067/
https://www.ncbi.nlm.nih.gov/pubmed/20070912
http://dx.doi.org/10.1186/1471-2105-11-27
author Repsilber, Dirk
Kern, Sabine
Telaar, Anna
Walzl, Gerhard
Black, Gillian F
Selbig, Joachim
Parida, Shreemanta K
Kaufmann, Stefan HE
Jacobsen, Marc
collection PubMed
description BACKGROUND: For heterogeneous tissues, such as blood, measurements of gene expression are confounded by relative proportions of cell types involved. Conclusions have to rely on estimation of gene expression signals for homogeneous cell populations, e.g. by applying micro-dissection, fluorescence activated cell sorting, or in-silico deconfounding. We studied feasibility and validity of a non-negative matrix decomposition algorithm using experimental gene expression data for blood and sorted cells from the same donor samples. Our objective was to optimize the algorithm regarding detection of differentially expressed genes and to enable its use for classification in the difficult scenario of reversely regulated genes. This would be of importance for the identification of candidate biomarkers in heterogeneous tissues.
RESULTS: Experimental data and simulation studies involving noise parameters estimated from these data revealed that for valid detection of differential gene expression, quantile normalization and use of non-log data are optimal. We demonstrate the feasibility of predicting proportions of constituting cell types from gene expression data of single samples, as a prerequisite for a deconfounding-based classification approach. Classification cross-validation errors with and without using deconfounding results are reported as well as sample-size dependencies. Implementation of the algorithm, simulation and analysis scripts are available.
CONCLUSIONS: The deconfounding algorithm without decorrelation using quantile normalization on non-log data is proposed for biomarkers that are difficult to detect, and for cases where confounding by varying proportions of cell types is the suspected reason. In this case, a deconfounding ranking approach can be used as a powerful alternative to, or complement of, other statistical learning approaches to define candidate biomarkers for molecular diagnosis and prediction in biomedicine, in realistically noisy conditions and with moderate sample sizes.
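The two steps the abstract names, quantile normalization on non-log data followed by a non-negative matrix decomposition, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the multiplicative-update NMF, the function names (`quantile_normalize`, `nmf`, `proportions`), the synthetic data, and the choice of three cell types are all assumptions made for illustration only.

```python
import numpy as np

def quantile_normalize(X):
    """Give every sample (column) the same empirical distribution."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)   # rank of each gene within its sample
    mean_quantiles = np.sort(X, axis=0).mean(axis=1)    # average value at each rank
    return mean_quantiles[ranks]

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate X (genes x samples) by W (genes x k) @ H (k x samples),
    both non-negative, using generic multiplicative updates (an assumption,
    not necessarily the decomposition used in the paper)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def proportions(H):
    """Rescale each sample's coefficients to sum to 1 (a simplification:
    NMF factors are only defined up to scaling and component order)."""
    return H / H.sum(axis=0, keepdims=True)

# Synthetic, non-log "expression" data: 200 genes, 3 hypothetical cell types, 12 samples.
rng = np.random.default_rng(1)
signatures = rng.gamma(2.0, 50.0, size=(200, 3))     # cell-type-specific expression
fractions = rng.dirichlet(np.ones(3), size=12).T     # true mixing proportions (3 x 12)
mixed = signatures @ fractions                       # confounded tissue measurements

mixed_qn = quantile_normalize(mixed)
W, H = nmf(mixed_qn, k=3)
estimated = proportions(H)
rel_error = np.linalg.norm(mixed_qn - W @ H) / np.linalg.norm(mixed_qn)
```

In this sketch the columns of `W` play the role of estimated cell-type expression signatures and `estimated` holds per-sample proportion estimates. Comparing them with `signatures` and `fractions` would require matching components first, since the decomposition does not fix their order.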
format Text
id pubmed-3098067
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3098067 2011-05-20 BMC Bioinformatics Research Article BioMed Central 2010-01-14 /pmc/articles/PMC3098067/ /pubmed/20070912 http://dx.doi.org/10.1186/1471-2105-11-27 Text en Copyright © 2010 Repsilber et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Biomarker discovery in heterogeneous tissue samples -taking the in-silico deconfounding approach
topic Research Article