Dual KS: Defining Gene Sets with Tissue Set Enrichment Analysis
BACKGROUND: Gene set enrichment analysis (GSEA) is an analytic approach which simultaneously reduces the dimensionality of microarray data and enables ready inference of the biological meaning of observed gene expression patterns. Here we invert the GSEA process to identify class-specific gene signa...
Main authors: | Yang, Yarong; Kort, Eric J.; Ebrahimi, Nader; Zhang, Zhongfa; Teh, Bin T. |
---|---|
Format: | Text |
Language: | English |
Published: | Libertas Academica 2010 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2816930/ https://www.ncbi.nlm.nih.gov/pubmed/20148167 |
_version_ | 1782177156412997632 |
---|---|
author | Yang, Yarong Kort, Eric J. Ebrahimi, Nader Zhang, Zhongfa Teh, Bin T. |
author_sort | Yang, Yarong |
collection | PubMed |
description | BACKGROUND: Gene set enrichment analysis (GSEA) is an analytic approach which simultaneously reduces the dimensionality of microarray data and enables ready inference of the biological meaning of observed gene expression patterns. Here we invert the GSEA process to identify class-specific gene signatures. Because our approach uses the Kolmogorov-Smirnov statistic both to define class-specific signatures and to classify samples using those signatures, we have termed this methodology “Dual-KS” (DKS). RESULTS: The optimum gene signature identified by the DKS algorithm was smaller than those identified by the other methods to which it was compared in 5 out of 10 datasets. The estimated error rate of DKS using the optimum gene signature was smaller than the estimated error rate of the random forest method in 4 out of the 10 datasets, and was equivalent in two additional datasets. DKS performance relative to other benchmarked algorithms was similar to its performance relative to random forests. CONCLUSIONS: DKS is an efficient analytic methodology that can identify highly parsimonious gene signatures useful for classification in the context of microarray studies. The algorithm is available as the dualKS package for R as part of the Bioconductor project. |
format | Text |
id | pubmed-2816930 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Libertas Academica |
record_format | MEDLINE/PubMed |
spelling | pubmed-28169302010-02-10 Dual KS: Defining Gene Sets with Tissue Set Enrichment Analysis Yang, Yarong Kort, Eric J. Ebrahimi, Nader Zhang, Zhongfa Teh, Bin T. Cancer Inform Methodology BACKGROUND: Gene set enrichment analysis (GSEA) is an analytic approach which simultaneously reduces the dimensionality of microarray data and enables ready inference of the biological meaning of observed gene expression patterns. Here we invert the GSEA process to identify class-specific gene signatures. Because our approach uses the Kolmogorov-Smirnov approach both to define class specific signatures and to classify samples using those signatures, we have termed this methodology “Dual-KS” (DKS). RESULTS: The optimum gene signature identified by the DKS algorithm was smaller than other methods to which it was compared in 5 out of 10 datasets. The estimated error rate of DKS using the optimum gene signature was smaller than the estimated error rate of the random forest method in 4 out of the 10 datasets, and was equivalent in two additional datasets. DKS performance relative to other benchmarked algorithms was similar to its performance relative to random forests. CONCLUSIONS: DKS is an efficient analytic methodology that can identify highly parsimonious gene signatures useful for classification in the context of microarray studies. The algorithm is available as the dualKS package for R as part of the bioconductor project. Libertas Academica 2010-01-21 /pmc/articles/PMC2816930/ /pubmed/20148167 Text en © 2010 The authors. http://creativecommons.org/licenses/by/3.0 This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/). |
spellingShingle | Methodology Yang, Yarong Kort, Eric J. Ebrahimi, Nader Zhang, Zhongfa Teh, Bin T. Dual KS: Defining Gene Sets with Tissue Set Enrichment Analysis |
title | Dual KS: Defining Gene Sets with Tissue Set Enrichment Analysis |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2816930/ https://www.ncbi.nlm.nih.gov/pubmed/20148167 |