Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates
BACKGROUND: Gene set analysis (GSA) of gene expression data can be highly powerful when the biological signal is weak compared to other sources of variability in the data. However, many gene set analysis approaches utilize permutation tests which are not appropriate for complex study designs. For ex...
Main Authors: | Turner, Jacob A., Bolen, Christopher R., Blankenship, Derek M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4551517/ https://www.ncbi.nlm.nih.gov/pubmed/26316107 http://dx.doi.org/10.1186/s12859-015-0707-9 |
_version_ | 1782387578074300416 |
---|---|
author | Turner, Jacob A. Bolen, Christopher R. Blankenship, Derek M. |
author_facet | Turner, Jacob A. Bolen, Christopher R. Blankenship, Derek M. |
author_sort | Turner, Jacob A. |
collection | PubMed |
description | BACKGROUND: Gene set analysis (GSA) of gene expression data can be highly powerful when the biological signal is weak compared to other sources of variability in the data. However, many gene set analysis approaches utilize permutation tests which are not appropriate for complex study designs. For example, the correlation of subjects is broken when comparing time points within a longitudinal study. Linear mixed models provide a method to analyze longitudinal studies as well as adjust for potential confounding factors and account for sources of variability that are not of primary interest. Currently, there are no known gene set analysis approaches that fully account for these study design and analysis aspects. In order to do so, we generalize the QuSAGE gene set analysis algorithm, denoted Q-Gen, and provide the necessary estimation adjustments to incorporate linear mixed model analyses. RESULTS: We assessed the performance of our generalized method in comparison to the original QuSAGE method in settings such as longitudinal repeated measures analysis and accounting for potential confounders. We demonstrate that the original QuSAGE method cannot control for type-I error when these complexities exist. In addition to statistical appropriateness, analysis of a longitudinal influenza study suggests Q-Gen can allow for greater sensitivity when exploring a large number of gene sets. CONCLUSIONS: Q-Gen is an extension to the gene set analysis method of QuSAGE, and allows for linear mixed models to be applied appropriately within a gene set analysis framework. It provides GSA an added layer of flexibility that was not previously available. This flexibility allows for more appropriate statistical modeling of complex data structures that are inherent to many microarray study designs and can provide more sensitivity. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0707-9) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4551517 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4551517 2015-08-29 Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates Turner, Jacob A. Bolen, Christopher R. Blankenship, Derek M. BMC Bioinformatics Methodology Article BACKGROUND: Gene set analysis (GSA) of gene expression data can be highly powerful when the biological signal is weak compared to other sources of variability in the data. However, many gene set analysis approaches utilize permutation tests which are not appropriate for complex study designs. For example, the correlation of subjects is broken when comparing time points within a longitudinal study. Linear mixed models provide a method to analyze longitudinal studies as well as adjust for potential confounding factors and account for sources of variability that are not of primary interest. Currently, there are no known gene set analysis approaches that fully account for these study design and analysis aspects. In order to do so, we generalize the QuSAGE gene set analysis algorithm, denoted Q-Gen, and provide the necessary estimation adjustments to incorporate linear mixed model analyses. RESULTS: We assessed the performance of our generalized method in comparison to the original QuSAGE method in settings such as longitudinal repeated measures analysis and accounting for potential confounders. We demonstrate that the original QuSAGE method cannot control for type-I error when these complexities exist. In addition to statistical appropriateness, analysis of a longitudinal influenza study suggests Q-Gen can allow for greater sensitivity when exploring a large number of gene sets. CONCLUSIONS: Q-Gen is an extension to the gene set analysis method of QuSAGE, and allows for linear mixed models to be applied appropriately within a gene set analysis framework. It provides GSA an added layer of flexibility that was not previously available. This flexibility allows for more appropriate statistical modeling of complex data structures that are inherent to many microarray study designs and can provide more sensitivity. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0707-9) contains supplementary material, which is available to authorized users. BioMed Central 2015-08-28 /pmc/articles/PMC4551517/ /pubmed/26316107 http://dx.doi.org/10.1186/s12859-015-0707-9 Text en © Turner et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Turner, Jacob A. Bolen, Christopher R. Blankenship, Derek M. Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates |
title | Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates |
title_full | Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates |
title_fullStr | Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates |
title_full_unstemmed | Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates |
title_short | Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates |
title_sort | quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4551517/ https://www.ncbi.nlm.nih.gov/pubmed/26316107 http://dx.doi.org/10.1186/s12859-015-0707-9 |
work_keys_str_mv | AT turnerjacoba quantitativegenesetanalysisgeneralizedforrepeatedmeasuresconfounderadjustmentandcontinuouscovariates AT bolenchristopherr quantitativegenesetanalysisgeneralizedforrepeatedmeasuresconfounderadjustmentandcontinuouscovariates AT blankenshipderekm quantitativegenesetanalysisgeneralizedforrepeatedmeasuresconfounderadjustmentandcontinuouscovariates |