
A Comparison of Gene Set Analysis Methods in Terms of Sensitivity, Prioritization and Specificity

Identification of functional sets of genes associated with conditions of interest from omics data was first reported in 1999, and since then a plethora of enrichment methods have been published for the systematic analysis of gene set collections, including Gene Ontology and biological pathways. Despite their wid...


Bibliographic Details
Main Authors: Tarca, Adi L., Bhatti, Gaurav, Romero, Roberto
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3829842/
https://www.ncbi.nlm.nih.gov/pubmed/24260172
http://dx.doi.org/10.1371/journal.pone.0079217
_version_ 1782291400658780160
author Tarca, Adi L.
Bhatti, Gaurav
Romero, Roberto
collection PubMed
description Identification of functional sets of genes associated with conditions of interest from omics data was first reported in 1999, and since then a plethora of enrichment methods have been published for the systematic analysis of gene set collections, including Gene Ontology and biological pathways. Despite their widespread use in reducing the complexity of omics experiment results, their performance is poorly understood. Leveraging the existence of disease-specific gene sets in the KEGG and Metacore® databases, we compared the performance of sixteen methods under relaxed assumptions while using 42 real datasets (over 1,400 samples). Most of the methods ranked the gene sets designed for specific diseases highly whenever samples from affected individuals were compared against controls via microarrays. The top methods for gene set prioritization were different from the top ones in terms of sensitivity, and four of the sixteen methods had high false positive rates, as assessed by permuting the phenotype of the samples. The best overall methods among those that generated reasonably low false positive rates when permuting phenotypes were PLAGE, GLOBALTEST, and PADOG. The best method in the category that generated higher than expected false positives was MRGSE.
format Online
Article
Text
id pubmed-3829842
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3829842 2013-11-20 A Comparison of Gene Set Analysis Methods in Terms of Sensitivity, Prioritization and Specificity. Tarca, Adi L.; Bhatti, Gaurav; Romero, Roberto. PLoS One. Research Article. Public Library of Science 2013-11-15 /pmc/articles/PMC3829842/ /pubmed/24260172 http://dx.doi.org/10.1371/journal.pone.0079217 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
title A Comparison of Gene Set Analysis Methods in Terms of Sensitivity, Prioritization and Specificity
topic Research Article