
Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods

Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding what methods should b...


Bibliographic Details
Main authors: Väremo, Leif; Nielsen, Jens; Nookaew, Intawat
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3632109/
https://www.ncbi.nlm.nih.gov/pubmed/23444143
http://dx.doi.org/10.1093/nar/gkt111
collection PubMed
description Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding which methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods. To address this, we have developed the R package Piano, which collects a range of GSA methods into the same system for the benefit of the end-user. Furthermore, we refine the GSA workflow by using modifications of the gene-level statistics. This enables us to divide the resulting gene set P-values into three classes, describing different aspects of gene expression directionality at the gene set level. We use our fully implemented workflow to investigate the impact of the individual components of GSA by using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and that the major separation correlates well with our defined directionality classes. As a consequence, we suggest using a consensus scoring approach based on multiple GSA runs. In combination with the directionality classes, this constitutes a more thorough basis for an enriched biological interpretation.
id pubmed-3632109
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-3632109 2013-04-22 Nucleic Acids Res Computational Biology Oxford University Press 2013-04 2013-02-26 Text en © The Author(s) 2013. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Computational Biology