
Optimization of gene set annotations via entropy minimization over variable clusters (EMVC)


Bibliographic Details
Main authors: Frost, H. Robert; Moore, Jason H.
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058919/
https://www.ncbi.nlm.nih.gov/pubmed/24574114
http://dx.doi.org/10.1093/bioinformatics/btu110
Description
Summary:
Motivation: Gene set enrichment has become a critical tool for interpreting the results of high-throughput genomic experiments. Inconsistent annotation quality and lack of annotation specificity, however, limit the statistical power of enrichment methods and make it difficult to replicate enrichment results across biologically similar datasets.
Results: We propose a novel algorithm for optimizing gene set annotations to best match the structure of specific empirical data sources. Our proposed method, entropy minimization over variable clusters (EMVC), filters the annotations for each gene set to minimize a measure of entropy across disjoint gene clusters computed for a range of cluster sizes over multiple bootstrap resampled datasets. As shown using simulated gene sets with simulated data and Molecular Signatures Database collections with microarray gene expression data, the EMVC algorithm accurately filters annotations unrelated to the experimental outcome, resulting in increased gene set enrichment power and better replication of enrichment results.
Availability and implementation: http://cran.r-project.org/web/packages/EMVC/index.html.
Contact: jason.h.moore@dartmouth.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
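The filtering idea in the summary above can be illustrated with a small sketch. This is only one plausible reading of the abstract, not the authors' implementation and not the API of the EMVC R package. The function names, the voting rule, and the `threshold` parameter are all hypothetical. The sketch assumes the gene clusterings (one per bootstrap resample and cluster count) have already been computed elsewhere and are passed in as dictionaries mapping each gene to a cluster label.

```python
import math
from collections import Counter


def set_entropy(members, labels):
    """Shannon entropy (bits) of how a gene set's members spread over clusters."""
    counts = Counter(labels[g] for g in members)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def removal_vote_fraction(members, clusterings):
    """For each gene, the fraction of clusterings in which dropping that
    gene lowers the entropy of the remaining set (hypothetical voting rule)."""
    votes = {}
    for gene in members:
        rest = [g for g in members if g != gene]
        lowered = sum(
            1
            for labels in clusterings
            if set_entropy(rest, labels) < set_entropy(members, labels)
        )
        votes[gene] = lowered / len(clusterings)
    return votes


def filter_annotations(members, clusterings, threshold=0.5):
    """Keep genes whose removal lowers entropy in at most `threshold`
    of the clusterings (threshold is an assumed tuning parameter)."""
    votes = removal_vote_fraction(members, clusterings)
    return [g for g in members if votes[g] <= threshold]


# Toy input: gene "x" never clusters with the rest of the set.
# Each dict stands in for one clustering of one bootstrap resample.
gene_set = ["a", "b", "c", "x"]
clusterings = [
    {"a": 0, "b": 0, "c": 0, "x": 1},
    {"a": 0, "b": 0, "c": 0, "x": 2},
    {"a": 0, "b": 1, "c": 0, "x": 2},
]

kept = filter_annotations(gene_set, clusterings)
print(kept)  # prints ['a', 'b', 'c']
```

In this toy input, removing "x" lowers the set entropy in every clustering, so it is filtered out. Removing "b" helps in only one of the three clusterings, so it stays in the set.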