Gene set analysis using variance component tests

BACKGROUND: Gene set analyses have become increasingly important in genomic research, as many complex diseases arise jointly from alterations in numerous genes. Genes often coordinate as a functional repertoire, e.g., a biological pathway or network, and are highly correlated. However,...

Bibliographic Details
Main authors: Huang, Yen-Tsung; Lin, Xihong
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3776447/
https://www.ncbi.nlm.nih.gov/pubmed/23806107
http://dx.doi.org/10.1186/1471-2105-14-210
_version_ 1782477485997293568
author Huang, Yen-Tsung
Lin, Xihong
author_facet Huang, Yen-Tsung
Lin, Xihong
author_sort Huang, Yen-Tsung
collection PubMed
description BACKGROUND: Gene set analyses have become increasingly important in genomic research, as many complex diseases arise jointly from alterations in numerous genes. Genes often coordinate as a functional repertoire, e.g., a biological pathway or network, and are highly correlated. However, most existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to account for this important feature of a gene set to improve statistical power in gene set analyses. RESULTS: We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects that assumes a common distribution for the regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and that power improves as the working covariance approaches the true covariance. The global test is a special case of TEGS in which correlation among genes in a gene set is ignored. Using both simulated data and a published diabetes dataset, we show that our test outperforms two commonly used approaches, the global test and gene set enrichment analysis (GSEA). CONCLUSION: We develop a gene set analysis method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and the global test, in both simulations and a diabetes microarray dataset.
format Online
Article
Text
id pubmed-3776447
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3776447 2013-09-19 Gene set analysis using variance component tests Huang, Yen-Tsung Lin, Xihong BMC Bioinformatics Methodology Article BioMed Central 2013-06-28 /pmc/articles/PMC3776447/ /pubmed/23806107 http://dx.doi.org/10.1186/1471-2105-14-210 Text en Copyright © 2013 Huang and Lin; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Huang, Yen-Tsung
Lin, Xihong
Gene set analysis using variance component tests
title Gene set analysis using variance component tests
title_full Gene set analysis using variance component tests
title_fullStr Gene set analysis using variance component tests
title_full_unstemmed Gene set analysis using variance component tests
title_short Gene set analysis using variance component tests
title_sort gene set analysis using variance component tests
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3776447/
https://www.ncbi.nlm.nih.gov/pubmed/23806107
http://dx.doi.org/10.1186/1471-2105-14-210
work_keys_str_mv AT huangyentsung genesetanalysisusingvariancecomponenttests
AT linxihong genesetanalysisusingvariancecomponenttests
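
Illustrative sketch (not part of the catalog record). The abstract describes a score-type test for the effect of one independent variable on all genes in a set, with a working covariance matrix and permutation p-values. The Python code below is a minimal sketch of that general idea under stated assumptions; it is not the authors' TEGS implementation, and the statistic used here, Q = ||W^{-1} Yc' xc||^2, is a generic stand-in rather than the formula from the paper. The function names (exchangeable, score_statistic, permutation_pvalue), the exchangeable working covariance, and the simulated data are all hypothetical choices made for illustration. With W set to the identity matrix, Q reduces to a sum of squared per-gene cross-products, that is, a statistic that ignores correlation among genes.

```python
import numpy as np


def exchangeable(p, rho):
    """Hypothetical helper: p x p working covariance, unit variances, common correlation rho."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))


def score_statistic(x, Y, W):
    """Illustrative statistic Q = ||W^{-1} Yc' xc||^2 (an assumption, not the paper's exact form).

    x : (n,) independent variable, e.g. exposure coded 0/1
    Y : (n, p) expression values for the p genes in the set
    W : (p, p) working covariance matrix among the genes
    """
    xc = x - x.mean()              # centring x absorbs the intercept
    Yc = Y - Y.mean(axis=0)        # centre each gene
    s = Yc.T @ xc                  # per-gene cross-products with x, shape (p,)
    u = np.linalg.solve(W, s)      # weight by the inverse working covariance
    return float(u @ u)


def permutation_pvalue(x, Y, W, n_perm=999, seed=0):
    """Permutation p-value: shuffle x across samples while keeping Y and W fixed."""
    rng = np.random.default_rng(seed)
    q_obs = score_statistic(x, Y, W)
    q_null = np.array(
        [score_statistic(rng.permutation(x), Y, W) for _ in range(n_perm)]
    )
    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (1 + np.sum(q_null >= q_obs)) / (1 + n_perm)
    return q_obs, float(p_value)


# Simulated data (made up for this sketch): 40 samples, 5 correlated genes,
# and a mean shift of 0.8 in every gene for the exposed group.
rng = np.random.default_rng(42)
n, p = 40, 5
x = np.repeat([0.0, 1.0], n // 2)
noise = rng.multivariate_normal(np.zeros(p), exchangeable(p, 0.5), size=n)
Y = noise + 0.8 * x[:, None]

for label, W in [("identity", np.eye(p)), ("exchangeable", exchangeable(p, 0.5))]:
    q, pv = permutation_pvalue(x, Y, W)
    print(f"{label:>12}: Q = {q:.2f}, permutation p = {pv:.3f}")
```

Keeping W fixed across permutations means the only thing that changes between permuted datasets is the pairing of x with the expression rows, which is what the null hypothesis of no gene set effect makes exchangeable. The scaled chi-square approximation mentioned in the abstract is not sketched here, because the record does not give its form.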