Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates

BACKGROUND: Gene set analysis (GSA) of gene expression data can be highly powerful when the biological signal is weak compared to other sources of variability in the data. However, many gene set analysis approaches utilize permutation tests which are not appropriate for complex study designs. For ex...

Bibliographic Details
Main authors: Turner, Jacob A., Bolen, Christopher R., Blankenship, Derek M.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4551517/
https://www.ncbi.nlm.nih.gov/pubmed/26316107
http://dx.doi.org/10.1186/s12859-015-0707-9
author Turner, Jacob A.
Bolen, Christopher R.
Blankenship, Derek M.
collection PubMed
description BACKGROUND: Gene set analysis (GSA) of gene expression data can be highly powerful when the biological signal is weak compared to other sources of variability in the data. However, many gene set analysis approaches utilize permutation tests which are not appropriate for complex study designs. For example, the correlation of subjects is broken when comparing time points within a longitudinal study. Linear mixed models provide a method to analyze longitudinal studies as well as adjust for potential confounding factors and account for sources of variability that are not of primary interest. Currently, there are no known gene set analysis approaches that fully account for these study design and analysis aspects. In order to do so, we generalize the QuSAGE gene set analysis algorithm, denoted Q-Gen, and provide the necessary estimation adjustments to incorporate linear mixed model analyses. RESULTS: We assessed the performance of our generalized method in comparison to the original QuSAGE method in settings such as longitudinal repeated measures analysis and accounting for potential confounders. We demonstrate that the original QuSAGE method cannot control type-I error when these complexities exist. In addition to statistical appropriateness, analysis of a longitudinal influenza study suggests Q-Gen can allow for greater sensitivity when exploring a large number of gene sets. CONCLUSIONS: Q-Gen is an extension to the gene set analysis method of QuSAGE, and allows for linear mixed models to be applied appropriately within a gene set analysis framework. It provides GSA an added layer of flexibility that was not previously available. This flexibility allows for more appropriate statistical modeling of complex data structures that are inherent to many microarray study designs and can provide more sensitivity.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0707-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4551517
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4551517 2015-08-29 Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates Turner, Jacob A. Bolen, Christopher R. Blankenship, Derek M. BMC Bioinformatics Methodology Article BioMed Central 2015-08-28 /pmc/articles/PMC4551517/ /pubmed/26316107 http://dx.doi.org/10.1186/s12859-015-0707-9 Text en © Turner et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4551517/
https://www.ncbi.nlm.nih.gov/pubmed/26316107
http://dx.doi.org/10.1186/s12859-015-0707-9