Cargando…

De-correlating expression in gene-set analysis

Motivation: Group-wise pattern analysis of genes, known as gene-set analysis (GSA), addresses the differential expression pattern of biologically pre-defined gene sets. GSA exhibits high statistical power and has revealed many novel biological processes associated with specific phenotypes. In most c...

Descripción completa

Detalles Bibliográficos
Autor principal: Nam, Dougu
Formato: Texto
Lenguaje:English
Publicado: Oxford University Press 2010
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935420/
https://www.ncbi.nlm.nih.gov/pubmed/20823315
http://dx.doi.org/10.1093/bioinformatics/btq380
Descripción
Sumario:Motivation: Group-wise pattern analysis of genes, known as gene-set analysis (GSA), addresses the differential expression pattern of biologically pre-defined gene sets. GSA exhibits high statistical power and has revealed many novel biological processes associated with specific phenotypes. In most cases, however, GSA relies on the invalid assumption that the members of each gene set are sampled independently, which increases false predictions. Results: We propose an algorithm, termed DECO, to remove (or alleviate) the bias caused by the correlation of the expression data in GSAs. This is accomplished through the eigenvalue-decomposition of covariance matrixes and a series of linear transformations of data. In particular, moderate de-correlation methods that truncate or re-scale eigenvalues were proposed for a more reliable analysis. Tests of simulated and real experimental data show that DECO effectively corrects the correlation structure of gene expression and improves the prediction accuracy (specificity and sensitivity) for both gene- and sample-randomizing GSA methods. Availability: The MATLAB codes and the tested data sets are available at ftp://deco.nims.re.kr/pub or from the author. Contact: dougnam@unist.ac.kr