Dual KS: Defining Gene Sets with Tissue Set Enrichment Analysis

BACKGROUND: Gene set enrichment analysis (GSEA) is an analytic approach which simultaneously reduces the dimensionality of microarray data and enables ready inference of the biological meaning of observed gene expression patterns. Here we invert the GSEA process to identify class-specific gene signa...

Bibliographic Details
Main authors: Yang, Yarong; Kort, Eric J.; Ebrahimi, Nader; Zhang, Zhongfa; Teh, Bin T.
Format: Text
Language: English
Published: Libertas Academica 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2816930/
https://www.ncbi.nlm.nih.gov/pubmed/20148167
author Yang, Yarong
Kort, Eric J.
Ebrahimi, Nader
Zhang, Zhongfa
Teh, Bin T.
collection PubMed
description BACKGROUND: Gene set enrichment analysis (GSEA) is an analytic approach that simultaneously reduces the dimensionality of microarray data and enables ready inference of the biological meaning of observed gene expression patterns. Here we invert the GSEA process to identify class-specific gene signatures. Because our method uses the Kolmogorov-Smirnov approach both to define class-specific signatures and to classify samples using those signatures, we have termed this methodology “Dual-KS” (DKS). RESULTS: The optimum gene signature identified by the DKS algorithm was smaller than those of the other methods to which it was compared in 5 out of 10 datasets. The estimated error rate of DKS using the optimum gene signature was smaller than the estimated error rate of the random forest method in 4 out of the 10 datasets, and was equivalent in two additional datasets. DKS performance relative to other benchmarked algorithms was similar to its performance relative to random forests. CONCLUSIONS: DKS is an efficient analytic methodology that can identify highly parsimonious gene signatures useful for classification in the context of microarray studies. The algorithm is available as the dualKS package for R as part of the Bioconductor project.
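The description above says that a Kolmogorov-Smirnov approach is used to classify samples with class-specific signatures. The following is a minimal, illustrative sketch of that general idea in Python, not the authors' implementation and not the dualKS package API: it computes an unweighted KS-style running-sum score for a signature within one sample's ranked gene list, then picks the class whose signature scores highest. The function names (`ks_enrichment_score`, `classify`), the gene identifiers, and the class labels are all hypothetical, and the exact statistic used by DKS may differ.

```python
def ks_enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-style running-sum score (illustrative assumption).

    ranked_genes: unique gene identifiers, ordered from most to least expressed.
    gene_set: gene identifiers making up a candidate signature.
    Returns the maximum positive deviation of the running sum from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        if gene in members:
            running += 1.0 / n_hit   # step up on a signature gene
        else:
            running -= 1.0 / n_miss  # step down on any other gene
        best = max(best, running)
    return best


def classify(ranked_genes, signatures):
    """Return the class whose signature scores highest, plus all scores."""
    scores = {label: ks_enrichment_score(ranked_genes, genes)
              for label, genes in signatures.items()}
    return max(scores, key=scores.get), scores


# Hypothetical sample: six genes ranked by expression, two toy signatures.
sample = ["g1", "g2", "g3", "g4", "g5", "g6"]
toy_signatures = {"kidney": ["g1", "g2"], "liver": ["g5", "g6"]}
label, scores = classify(sample, toy_signatures)
```

In this toy case the "kidney" signature genes sit at the top of the ranking, so that class receives the highest score. A signature whose genes cluster near the top of a sample's ranking scores close to 1, while one whose genes are scattered or near the bottom scores close to 0.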
format Text
id pubmed-2816930
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-2816930 2010-02-10 Cancer Inform Methodology Libertas Academica 2010-01-21 /pmc/articles/PMC2816930/ /pubmed/20148167 Text en © 2010 The authors. http://creativecommons.org/licenses/by/3.0 This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title Dual KS: Defining Gene Sets with Tissue Set Enrichment Analysis
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2816930/
https://www.ncbi.nlm.nih.gov/pubmed/20148167