Avoiding the pitfalls of gene set enrichment analysis with SetRank

BACKGROUND: The purpose of gene set enrichment analysis (GSEA) is to find general trends in the huge lists of genes or proteins generated by many functional genomics techniques and bioinformatics analyses. RESULTS: Here we present SetRank, an advanced GSEA algorithm which is able to eliminate many f...


Bibliographic Details
Main Authors: Simillion, Cedric, Liechti, Robin, Lischer, Heidi E.L., Ioannidis, Vassilios, Bruggmann, Rémy
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5336655/
https://www.ncbi.nlm.nih.gov/pubmed/28259142
http://dx.doi.org/10.1186/s12859-017-1571-6
_version_ 1782512232057274368
author Simillion, Cedric
Liechti, Robin
Lischer, Heidi E.L.
Ioannidis, Vassilios
Bruggmann, Rémy
author_facet Simillion, Cedric
Liechti, Robin
Lischer, Heidi E.L.
Ioannidis, Vassilios
Bruggmann, Rémy
author_sort Simillion, Cedric
collection PubMed
description BACKGROUND: The purpose of gene set enrichment analysis (GSEA) is to find general trends in the huge lists of genes or proteins generated by many functional genomics techniques and bioinformatics analyses. RESULTS: Here we present SetRank, an advanced GSEA algorithm which is able to eliminate many false positive hits. The key principle of the algorithm is that it discards gene sets that have initially been flagged as significant, if their significance is only due to the overlap with another gene set. The algorithm is explained in detail and its performance is compared to that of other methods using objective benchmarking criteria. Furthermore, we explore how sample source bias can affect the results of a GSEA analysis. CONCLUSIONS: The benchmarking results show that SetRank is a highly specific tool for GSEA. Furthermore, we show that the reliability of results can be improved by taking sample source bias into account. SetRank is available as an R package and through an online web interface.
format Online
Article
Text
id pubmed-5336655
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5336655 2017-03-07 Avoiding the pitfalls of gene set enrichment analysis with SetRank Simillion, Cedric Liechti, Robin Lischer, Heidi E.L. Ioannidis, Vassilios Bruggmann, Rémy BMC Bioinformatics Methodology Article BACKGROUND: The purpose of gene set enrichment analysis (GSEA) is to find general trends in the huge lists of genes or proteins generated by many functional genomics techniques and bioinformatics analyses. RESULTS: Here we present SetRank, an advanced GSEA algorithm which is able to eliminate many false positive hits. The key principle of the algorithm is that it discards gene sets that have initially been flagged as significant, if their significance is only due to the overlap with another gene set. The algorithm is explained in detail and its performance is compared to that of other methods using objective benchmarking criteria. Furthermore, we explore how sample source bias can affect the results of a GSEA analysis. CONCLUSIONS: The benchmarking results show that SetRank is a highly specific tool for GSEA. Furthermore, we show that the reliability of results can be improved by taking sample source bias into account. SetRank is available as an R package and through an online web interface. BioMed Central 2017-03-04 /pmc/articles/PMC5336655/ /pubmed/28259142 http://dx.doi.org/10.1186/s12859-017-1571-6 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Simillion, Cedric
Liechti, Robin
Lischer, Heidi E.L.
Ioannidis, Vassilios
Bruggmann, Rémy
Avoiding the pitfalls of gene set enrichment analysis with SetRank
title Avoiding the pitfalls of gene set enrichment analysis with SetRank
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5336655/
https://www.ncbi.nlm.nih.gov/pubmed/28259142
http://dx.doi.org/10.1186/s12859-017-1571-6