Microarray-based gene set analysis: a comparison of current methods

BACKGROUND: The analysis of gene sets has become a popular topic in recent times, with researchers attempting to improve the interpretability and reproducibility of their microarray analyses through the inclusion of supplementary biological information. While a number of options for gene set analysi...

Bibliographic Details
Main Authors: Song, Sarah; Black, Michael A
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2607289/
https://www.ncbi.nlm.nih.gov/pubmed/19038052
http://dx.doi.org/10.1186/1471-2105-9-502
_version_ 1782163045422727168
author Song, Sarah
Black, Michael A
collection PubMed
description BACKGROUND: The analysis of gene sets has become a popular topic in recent times, with researchers attempting to improve the interpretability and reproducibility of their microarray analyses through the inclusion of supplementary biological information. While a number of options for gene set analysis exist, no consensus has yet been reached regarding which methodology performs best, and under what conditions. The goal of this work was to examine the performance characteristics of a collection of existing gene set analysis methods, on both simulated and real microarray data sets. Of particular interest was the potential utility gained through the incorporation of inter-gene correlation into the analysis process. RESULTS: Each of six gene set analysis methods was applied to both simulated and publicly available microarray data sets. Overall, the various methodologies were all found to be better at detecting gene sets that moved from non-active (i.e., genes not expressed) to active states (or vice versa), rather than those that simply changed their level of activity. Methods which incorporate correlation structures were found to provide increased ability to detect altered gene sets in some settings. CONCLUSION: Based on the results obtained through the analysis of simulated data, it is clear that the performance of gene set analysis methods is strongly influenced by the features of the data set in question, and that methods which incorporate correlation structures into the analysis process tend to achieve better performance, relative to methods which rely on univariate test statistics.
format Text
id pubmed-2607289
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2607289 2008-12-24 Microarray-based gene set analysis: a comparison of current methods Song, Sarah Black, Michael A BMC Bioinformatics Methodology Article BioMed Central 2008-11-27 /pmc/articles/PMC2607289/ /pubmed/19038052 http://dx.doi.org/10.1186/1471-2105-9-502 Text en Copyright © 2008 Song and Black; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Microarray-based gene set analysis: a comparison of current methods
topic Methodology Article