Longitudinal linear combination test for gene set analysis

BACKGROUND: Although microarray studies have greatly contributed to recent genetic advances, lack of replication has been a continuing concern in this area. Complex study designs have the potential to address this concern, though they remain undervalued by investigators due to the lack of proper ana...

Bibliographic Details
Main Authors: Khodayari Moez, Elham; Hajihosseini, Morteza; Andrews, Jeffrey L.; Dinu, Irina
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902471/
https://www.ncbi.nlm.nih.gov/pubmed/31822265
http://dx.doi.org/10.1186/s12859-019-3221-7
author Khodayari Moez, Elham
Hajihosseini, Morteza
Andrews, Jeffrey L.
Dinu, Irina
collection PubMed
description BACKGROUND: Although microarray studies have greatly contributed to recent genetic advances, lack of replication has been a continuing concern in this area. Complex study designs have the potential to address this concern, though they remain undervalued by investigators due to the lack of proper analysis methods. The primary challenge in the analysis of complex microarray study data is handling the correlation structure within the data while also dealing with the combination of a large number of genetic measurements and a small number of subjects that is ubiquitous even in standard microarray studies. Motivated by the lack of available methods for analysis of repeatedly measured phenotypic or transcriptomic data, herein we develop a longitudinal linear combination test (LLCT). RESULTS: LLCT is a two-step method to analyze multiple longitudinal phenotypes when there is high dimensionality in response and/or explanatory variables. Alternating between calculating within-subjects and between-subjects variations in two steps, LLCT examines whether the maximum possible correlation between a linear combination of the time trends and a linear combination of the predictors given by the gene expressions is statistically significant. A generalization of this method can handle family-based study designs when the subjects are not independent. This method is also applicable to time-course microarray data, with the ability to identify gene sets that exhibit significantly different expression patterns over time. Based on the results from a simulation study, LLCT outperformed its alternative: pathway analysis via regression. LLCT was shown to be very powerful in the analysis of large gene sets even when the sample size is small. CONCLUSIONS: This self-contained pathway analysis method is applicable to a wide range of longitudinal genomics, proteomics, and metabolomics (OMICS) data, allows adjustment for potentially time-dependent covariates, and works well with unbalanced and incomplete data. An important potential application of this method could be time-course linkage of OMICS, an attractive possibility for future genetic researchers. Availability: An R package for LLCT is available at https://github.com/its-likeli-jeff/LLCT
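The two steps summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' R package or its API. The function names (`subject_slopes`, `max_canonical_corr`, `llct_sketch`), the use of per-subject least-squares slopes as the time trends, the ridge term standing in for whatever covariance regularization the package applies, and the permutation p-value are all assumptions made for illustration.

```python
import numpy as np


def subject_slopes(times, values):
    """Step 1 (within-subject): least-squares slope over time per phenotype.

    times: (t,) array of visit times; values: (t, q) array of phenotypes.
    Returns a (q,) array of slopes. Subjects may have different t.
    """
    design = np.column_stack([np.ones_like(times, dtype=float), times])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return coef[1]  # row 0 holds intercepts, row 1 holds slopes


def max_canonical_corr(X, Y, ridge=0.1):
    """Largest canonical correlation between linear combinations of the
    columns of X and of Y, with a ridge term (an assumption) so that the
    covariance matrices stay invertible when columns outnumber subjects."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance: Lx^{-1} Sxy Ly^{-T}
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    return np.linalg.svd(M, compute_uv=False)[0]


def llct_sketch(times, phenotypes, genes, n_perm=200, ridge=0.1, seed=0):
    """Step 2 (between-subject): permutation test of the maximum correlation
    between the time-trend slopes and the gene expressions of one gene set.

    times, phenotypes: lists with one entry per subject.
    genes: (n_subjects, n_genes) array.
    Returns (observed statistic, permutation p-value).
    """
    rng = np.random.default_rng(seed)
    slopes = np.array([subject_slopes(t, y) for t, y in zip(times, phenotypes)])
    observed = max_canonical_corr(genes, slopes, ridge)
    # Shuffle subjects' slopes to break any link with gene expression.
    null = np.array([
        max_canonical_corr(genes, slopes[rng.permutation(len(slopes))], ridge)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value
```

Passing per-subject lists for `times` and `phenotypes` lets each subject have a different number of visits, which mirrors the unbalanced data mentioned in the conclusions. Covariate adjustment and the family-based generalization are not covered by this sketch.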
format Online
Article
Text
id pubmed-6902471
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6902471 2019-12-11 Longitudinal linear combination test for gene set analysis Khodayari Moez, Elham Hajihosseini, Morteza Andrews, Jeffrey L. Dinu, Irina BMC Bioinformatics Methodology Article BioMed Central 2019-12-10 /pmc/articles/PMC6902471/ /pubmed/31822265 http://dx.doi.org/10.1186/s12859-019-3221-7 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Longitudinal linear combination test for gene set analysis
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902471/
https://www.ncbi.nlm.nih.gov/pubmed/31822265
http://dx.doi.org/10.1186/s12859-019-3221-7