Improving sensitivity of linear regression-based cell type-specific differential expression deconvolution with per-gene vs. global significance threshold

Bibliographic Details
Main Authors: Glass, Edmund R., Dozmorov, Mikhail G.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5073979/
https://www.ncbi.nlm.nih.gov/pubmed/27766949
http://dx.doi.org/10.1186/s12859-016-1226-z
author Glass, Edmund R.
Dozmorov, Mikhail G.
collection PubMed
description BACKGROUND: The goal of many human disease-oriented studies is to detect molecular mechanisms different between healthy controls and patients. Yet, commonly used gene expression measurements from blood samples suffer from variability of cell composition. This variability hinders the detection of differentially expressed genes and is often ignored. Combined with cell counts, heterogeneous gene expression may provide deeper insights into the gene expression differences on the cell type-specific level. Published computational methods use linear regression to estimate cell type-specific differential expression, and a global cutoff to judge significance, such as False Discovery Rate (FDR). Yet, they do not consider many artifacts hidden in high-dimensional gene expression data that may negatively affect linear regression. In this paper we quantify the parameter space affecting the performance of linear regression (sensitivity of cell type-specific differential expression detection) on a per-gene basis. RESULTS: We evaluated the effect of sample sizes, cell type-specific proportion variability, and mean squared error on sensitivity of cell type-specific differential expression detection using linear regression. Each parameter affected variability of cell type-specific expression estimates and, subsequently, the sensitivity of differential expression detection. We provide the R package, LRCDE, which performs linear regression-based cell type-specific differential expression (deconvolution) detection on a gene-by-gene basis. Accounting for variability around cell type-specific gene expression estimates, it computes per-gene t-statistics of differential detection, p-values, t-statistic-based sensitivity, group-specific mean squared error, and several gene-specific diagnostic metrics. 
CONCLUSIONS: The sensitivity of linear regression-based cell type-specific differential expression detection differed for each gene as a function of mean squared error, per group sample sizes, and variability of the proportions of target cell (cell type being analyzed). We demonstrate that LRCDE, which uses Welch’s t-test to compare per-gene cell type-specific gene expression estimates, is more sensitive in detecting cell type-specific differential expression at α < 0.05 missed by the global false discovery rate threshold FDR < 0.3. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1226-z) contains supplementary material, which is available to authorized users.
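The per-gene procedure summarized in the description can be illustrated with a minimal sketch. It is not the LRCDE implementation or its API (LRCDE is an R package). It assumes a single gene, two cell types (a target cell type and "everything else"), a no-intercept least-squares fit within each group, and a Welch-style comparison of the two target-cell estimates using their regression standard errors. The function names, the two-cell-type simplification, and the toy numbers are all hypothetical.

```python
import math


def fit_two_cell_types(props, expr):
    """Fit expr_i = b_target * p_i + b_other * (1 - p_i) by least squares.

    props: target-cell proportions per sample (values in [0, 1]).
    expr:  heterogeneous (mixed-tissue) expression of one gene per sample.
    Returns (b_target, se_target, mse, residual_df).
    """
    n = len(props)
    if n < 3:
        raise ValueError("need at least 3 samples per group")
    other = [1.0 - p for p in props]
    # Entries of the 2x2 matrix X^T X and of the vector X^T y.
    s11 = sum(p * p for p in props)
    s12 = sum(p * q for p, q in zip(props, other))
    s22 = sum(q * q for q in other)
    t1 = sum(p * y for p, y in zip(props, expr))
    t2 = sum(q * y for q, y in zip(other, expr))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("proportions do not vary; cell types are not separable")
    b_target = (s22 * t1 - s12 * t2) / det
    b_other = (s11 * t2 - s12 * t1) / det
    df = n - 2  # samples minus number of fitted coefficients
    rss = sum((y - b_target * p - b_other * q) ** 2
              for p, q, y in zip(props, other, expr))
    mse = rss / df
    # Var(b_target) = MSE * [(X^T X)^{-1}]_11 = MSE * s22 / det
    se_target = math.sqrt(mse * s22 / det)
    return b_target, se_target, mse, df


def welch_t(est_a, se_a, df_a, est_b, se_b, df_b):
    """Welch-style t-statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = se_a ** 2, se_b ** 2
    t = (est_b - est_a) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / df_a + vb ** 2 / df_b)
    return t, df


# Toy data (hypothetical): target-cell expression near 10 in controls, 14 in cases.
ctrl_props = [0.2, 0.3, 0.4, 0.5, 0.6]
ctrl_noise = [0.1, -0.1, 0.05, -0.05, 0.0]
ctrl_expr = [10 * p + 5 * (1 - p) + e for p, e in zip(ctrl_props, ctrl_noise)]

case_props = [0.25, 0.35, 0.45, 0.55, 0.65]
case_noise = [-0.05, 0.1, -0.1, 0.05, 0.0]
case_expr = [14 * p + 5 * (1 - p) + e for p, e in zip(case_props, case_noise)]

ctrl_b, ctrl_se, ctrl_mse, ctrl_df = fit_two_cell_types(ctrl_props, ctrl_expr)
case_b, case_se, case_mse, case_df = fit_two_cell_types(case_props, case_expr)
t_stat, welch_df = welch_t(ctrl_b, ctrl_se, ctrl_df, case_b, case_se, case_df)
print(ctrl_b, case_b, t_stat, welch_df)
```

In this sketch the standard error of the target-cell estimate grows with the mean squared error and shrinks as the proportions vary more widely or the group gets larger, which mirrors the three factors the description names as driving per-gene sensitivity.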
format Online
Article
Text
id pubmed-5073979
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5073979 2016-10-27 Improving sensitivity of linear regression-based cell type-specific differential expression deconvolution with per-gene vs. global significance threshold Glass, Edmund R. Dozmorov, Mikhail G. BMC Bioinformatics Proceedings BioMed Central 2016-10-06 /pmc/articles/PMC5073979/ /pubmed/27766949 http://dx.doi.org/10.1186/s12859-016-1226-z Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Improving sensitivity of linear regression-based cell type-specific differential expression deconvolution with per-gene vs. global significance threshold
topic Proceedings