Gene set analysis for longitudinal gene expression data

BACKGROUND: Gene set analysis (GSA) has become a successful tool to interpret gene expression profiles in terms of biological functions, molecular pathways, or genomic locations. GSA performs statistical tests for independent microarray samples at the level of gene sets rather than individual genes....

Bibliographic Details
Main authors: Zhang, Ke; Wang, Haiyan; Bathke, Arne C; Harrar, Solomon W; Piepho, Hans-Peter; Deng, Youping
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3142525/
https://www.ncbi.nlm.nih.gov/pubmed/21722407
http://dx.doi.org/10.1186/1471-2105-12-273
author Zhang, Ke
Wang, Haiyan
Bathke, Arne C
Harrar, Solomon W
Piepho, Hans-Peter
Deng, Youping
collection PubMed
description BACKGROUND: Gene set analysis (GSA) has become a successful tool to interpret gene expression profiles in terms of biological functions, molecular pathways, or genomic locations. GSA performs statistical tests for independent microarray samples at the level of gene sets rather than individual genes. An increasing number of microarray studies are conducted to explore the dynamic changes of gene expression in a variety of species and biological scenarios. In these longitudinal studies, gene expression is measured repeatedly over time, so a GSA must account for within-gene correlations in addition to possible between-gene correlations. RESULTS: We provide a robust nonparametric approach to compare the expressions of longitudinally measured sets of genes under multiple treatments or experimental conditions. The limiting distributions of our statistics are derived when the number of genes goes to infinity while the number of replications can be small. When the number of genes in a gene set is small, we recommend permutation tests based on our nonparametric test statistics to achieve reliable type I error and better power while incorporating unknown correlations between and within genes. Simulation results demonstrate that the proposed method has greater power than other methods for various data distributions and heteroscedastic correlation structures. The method was applied to an IL-2 stimulation study, in which significantly altered gene sets were identified. CONCLUSIONS: The simulation study and the real data application showed that the proposed gene set analysis provides a promising tool for longitudinal microarray analysis. R scripts for simulating longitudinal data and calculating the nonparametric statistics are posted on the North Dakota INBRE website http://ndinbre.org/programs/bioinformatics.php. Raw microarray data are available in Gene Expression Omnibus (National Center for Biotechnology Information) with accession number GSE6085.
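As a rough illustration of the permutation idea mentioned in the abstract, the following Python sketch permutes treatment labels across whole subjects, so that each subject's gene-by-time block (and with it the within-gene and between-gene correlations) stays intact. It is not the authors' method and not their R scripts. The function names `rank_stat` and `permutation_pvalue`, the array layout (subjects x genes x time points), the two-group setting, and the simple squared mean-rank-difference statistic are all assumptions chosen for illustration.

```python
import numpy as np


def rank_stat(data, labels):
    """Illustrative statistic (an assumption, not the published one):
    sum over genes and time points of squared mean-rank differences
    between group 1 and group 0.

    data   : array of shape (n_subjects, n_genes, n_times)
    labels : integer array of shape (n_subjects,), values 0 or 1
    """
    n, g, t = data.shape
    # Rank within each gene, pooling subjects and time points.
    # Ties are broken arbitrarily by argsort (fine for continuous data).
    flat = data.transpose(1, 0, 2).reshape(g, n * t)
    ranks = flat.argsort(axis=1).argsort(axis=1).astype(float) + 1.0
    ranks = ranks.reshape(g, n, t)
    mean1 = ranks[:, labels == 1, :].mean(axis=1)  # shape (g, t)
    mean0 = ranks[:, labels == 0, :].mean(axis=1)  # shape (g, t)
    return float(((mean1 - mean0) ** 2).sum())


def permutation_pvalue(data, labels, n_perm=999, seed=0):
    """Permutation p-value obtained by shuffling subject-level labels."""
    rng = np.random.default_rng(seed)
    observed = rank_stat(data, labels)
    count = 0
    for _ in range(n_perm):
        # Whole subjects are relabelled; their gene x time blocks are untouched.
        permuted = rng.permutation(labels)
        if rank_stat(data, permuted) >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)
```

Permuting at the subject level is the design choice that matters here: shuffling individual measurements would break the repeated-measures structure that the abstract says a longitudinal GSA has to respect.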
format Online
Article
Text
id pubmed-3142525
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3142525 2011-07-24 Gene set analysis for longitudinal gene expression data Zhang, Ke Wang, Haiyan Bathke, Arne C Harrar, Solomon W Piepho, Hans-Peter Deng, Youping BMC Bioinformatics Methodology Article BioMed Central 2011-07-03 /pmc/articles/PMC3142525/ /pubmed/21722407 http://dx.doi.org/10.1186/1471-2105-12-273 Text en Copyright ©2011 Zhang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Gene set analysis for longitudinal gene expression data
topic Methodology Article