
Time-Course Gene Set Analysis for Longitudinal Gene Expression Data

Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied for analyzing gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longit...


Bibliographic Details
Main authors: Hejblum, Boris P., Skinner, Jason, Thiébaut, Rodolphe
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4482329/
https://www.ncbi.nlm.nih.gov/pubmed/26111374
http://dx.doi.org/10.1371/journal.pcbi.1004310
_version_ 1782378424242798592
author Hejblum, Boris P.
Skinner, Jason
Thiébaut, Rodolphe
author_facet Hejblum, Boris P.
Skinner, Jason
Thiébaut, Rodolphe
author_sort Hejblum, Boris P.
collection PubMed
description Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied to analyze gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to missing-at-random (MAR) measurements. TcGSA is a hypothesis-driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates whether the time patterns of gene sets differ significantly according to these conditions. The interest of the method is illustrated by its application to two real-life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial, TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing and a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative dataset, TcGSA identified 4 gene sets that were linked with the influenza vaccine as well, although previous analyses had found them to be associated with the pneumococcal vaccine only. In our simulation study, TcGSA exhibits good statistical properties and increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available to the community through an R package.
format Online
Article
Text
id pubmed-4482329
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-44823292015-07-01 Time-Course Gene Set Analysis for Longitudinal Gene Expression Data Hejblum, Boris P. Skinner, Jason Thiébaut, Rodolphe PLoS Comput Biol Research Article Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied to analyze gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to missing-at-random (MAR) measurements. TcGSA is a hypothesis-driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates whether the time patterns of gene sets differ significantly according to these conditions. The interest of the method is illustrated by its application to two real-life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial, TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing and a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative dataset, TcGSA identified 4 gene sets that were linked with the influenza vaccine as well, although previous analyses had found them to be associated with the pneumococcal vaccine only.
In our simulation study, TcGSA exhibits good statistical properties and increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available to the community through an R package. Public Library of Science 2015-06-25 /pmc/articles/PMC4482329/ /pubmed/26111374 http://dx.doi.org/10.1371/journal.pcbi.1004310 Text en © 2015 Hejblum et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Hejblum, Boris P.
Skinner, Jason
Thiébaut, Rodolphe
Time-Course Gene Set Analysis for Longitudinal Gene Expression Data
title Time-Course Gene Set Analysis for Longitudinal Gene Expression Data
title_full Time-Course Gene Set Analysis for Longitudinal Gene Expression Data
title_fullStr Time-Course Gene Set Analysis for Longitudinal Gene Expression Data
title_full_unstemmed Time-Course Gene Set Analysis for Longitudinal Gene Expression Data
title_short Time-Course Gene Set Analysis for Longitudinal Gene Expression Data
title_sort time-course gene set analysis for longitudinal gene expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4482329/
https://www.ncbi.nlm.nih.gov/pubmed/26111374
http://dx.doi.org/10.1371/journal.pcbi.1004310
work_keys_str_mv AT hejblumborisp timecoursegenesetanalysisforlongitudinalgeneexpressiondata
AT skinnerjason timecoursegenesetanalysisforlongitudinalgeneexpressiondata
AT thiebautrodolphe timecoursegenesetanalysisforlongitudinalgeneexpressiondata