Longitudinal linear combination test for gene set analysis
BACKGROUND: Although microarray studies have greatly contributed to recent genetic advances, lack of replication has been a continuing concern in this area. Complex study designs have the potential to address this concern, though they remain undervalued by investigators due to the lack of proper ana...
Main Authors: | Khodayari Moez, Elham; Hajihosseini, Morteza; Andrews, Jeffrey L.; Dinu, Irina |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902471/ https://www.ncbi.nlm.nih.gov/pubmed/31822265 http://dx.doi.org/10.1186/s12859-019-3221-7 |
_version_ | 1783477674433314816 |
---|---|
author | Khodayari Moez, Elham Hajihosseini, Morteza Andrews, Jeffrey L. Dinu, Irina |
author_facet | Khodayari Moez, Elham Hajihosseini, Morteza Andrews, Jeffrey L. Dinu, Irina |
author_sort | Khodayari Moez, Elham |
collection | PubMed |
description | BACKGROUND: Although microarray studies have greatly contributed to recent genetic advances, lack of replication has been a continuing concern in this area. Complex study designs have the potential to address this concern, though they remain undervalued by investigators due to the lack of proper analysis methods. The primary challenge in the analysis of complex microarray study data is handling the correlation structure within the data while also dealing with the combination of a large number of genetic measurements and a small number of subjects, which is ubiquitous even in standard microarray studies. Motivated by the lack of available methods for the analysis of repeatedly measured phenotypic or transcriptomic data, herein we develop a longitudinal linear combination test (LLCT). RESULTS: LLCT is a two-step method to analyze multiple longitudinal phenotypes when there is high dimensionality in the response and/or explanatory variables. Alternating between calculating within-subjects and between-subjects variations in two steps, LLCT examines whether the maximum possible correlation between a linear combination of the time trends and a linear combination of the predictors given by the gene expressions is statistically significant. A generalization of this method can handle family-based study designs in which the subjects are not independent. This method is also applicable to time-course microarray data, with the ability to identify gene sets that exhibit significantly different expression patterns over time. Based on the results from a simulation study, LLCT outperformed its alternative: pathway analysis via regression. LLCT was shown to be very powerful in the analysis of large gene sets even when the sample size is small. CONCLUSIONS: This self-contained pathway analysis method is applicable to a wide range of longitudinal genomics, proteomics, and metabolomics (OMICS) data, allows adjusting for potentially time-dependent covariates, and works well with unbalanced and incomplete data. An important potential application of this method could be time-course linkage of OMICS, an attractive possibility for future genetic researchers. Availability: R package of LLCT is available at: https://github.com/its-likeli-jeff/LLCT |
format | Online Article Text |
id | pubmed-6902471 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6902471 2019-12-11 Longitudinal linear combination test for gene set analysis Khodayari Moez, Elham Hajihosseini, Morteza Andrews, Jeffrey L. Dinu, Irina BMC Bioinformatics Methodology Article BioMed Central 2019-12-10 /pmc/articles/PMC6902471/ /pubmed/31822265 http://dx.doi.org/10.1186/s12859-019-3221-7 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Khodayari Moez, Elham Hajihosseini, Morteza Andrews, Jeffrey L. Dinu, Irina Longitudinal linear combination test for gene set analysis |
title | Longitudinal linear combination test for gene set analysis |
title_full | Longitudinal linear combination test for gene set analysis |
title_fullStr | Longitudinal linear combination test for gene set analysis |
title_full_unstemmed | Longitudinal linear combination test for gene set analysis |
title_short | Longitudinal linear combination test for gene set analysis |
title_sort | longitudinal linear combination test for gene set analysis |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902471/ https://www.ncbi.nlm.nih.gov/pubmed/31822265 http://dx.doi.org/10.1186/s12859-019-3221-7 |
work_keys_str_mv | AT khodayarimoezelham longitudinallinearcombinationtestforgenesetanalysis AT hajihosseinimorteza longitudinallinearcombinationtestforgenesetanalysis AT andrewsjeffreyl longitudinallinearcombinationtestforgenesetanalysis AT dinuirina longitudinallinearcombinationtestforgenesetanalysis |
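
The two-step idea summarized in the description field can be illustrated with a minimal Python sketch. It is not the authors' implementation or the API of the LLCT R package. It makes several assumptions: step 1 is a plain per-subject least-squares slope over time, step 2 is the largest canonical correlation between those slopes and the gene expressions, a small ridge term stabilizes the covariance matrices, and significance is assessed by permuting subjects. The function names `time_slopes`, `max_canonical_corr`, and `llct_sketch` are hypothetical.

```python
import numpy as np


def time_slopes(times, values):
    """Step 1 (within subject): least-squares slope of each phenotype on time.

    times: shape (T,); values: shape (T, P). Returns shape (P,).
    """
    t = times - times.mean()
    return t @ (values - values.mean(axis=0)) / (t @ t)


def max_canonical_corr(Y, X, ridge=1e-3):
    """Step 2 (between subjects): largest correlation between a linear
    combination of the columns of Y and one of the columns of X.

    The ridge term is an assumption of this sketch, added for stability.
    """
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Syy = Yc.T @ Yc / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxx = Xc.T @ Xc / (n - 1) + ridge * np.eye(X.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten both blocks with Cholesky factors; the singular values of the
    # whitened cross-covariance are the canonical correlations.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    return np.linalg.svd(M, compute_uv=False)[0]


def llct_sketch(times, pheno, genes, n_perm=200, seed=0):
    """Return (observed statistic, permutation p-value).

    times: list of per-subject time vectors (lengths may differ).
    pheno: list of per-subject arrays of shape (T_i, P).
    genes: array of shape (n_subjects, G).
    """
    rng = np.random.default_rng(seed)
    slopes = np.array([time_slopes(t, y) for t, y in zip(times, pheno)])
    observed = max_canonical_corr(slopes, genes)
    # Permuting subjects breaks any link between time trends and genes
    # (an assumed way of obtaining the null distribution).
    null = [
        max_canonical_corr(slopes[rng.permutation(len(slopes))], genes)
        for _ in range(n_perm)
    ]
    p_value = (1 + sum(v >= observed for v in null)) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    # Simulated, unbalanced data: phenotype 0 trends with gene 0.
    rng = np.random.default_rng(42)
    genes = rng.normal(size=(30, 4))
    times = [np.arange(rng.integers(3, 6), dtype=float) for _ in range(30)]
    pheno = [
        np.column_stack([genes[i, 0] * t + rng.normal(scale=0.1, size=t.size),
                         rng.normal(size=t.size)])
        for i, t in enumerate(times)
    ]
    stat, p = llct_sketch(times, pheno, genes)
    print(f"statistic={stat:.3f}, permutation p-value={p:.3f}")
```

Passing the time points as one vector per subject lets subjects contribute different numbers of visits, which mirrors the abstract's remark about unbalanced data. Covariate adjustment and the family-based generalization mentioned in the abstract are not covered by this sketch.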