
Linear combination test for gene set analysis of a continuous phenotype

BACKGROUND: Gene set analysis (GSA) methods test the association of sets of genes with a phenotype in gene expression microarray studies. Many GSA methods have been proposed, especially methods for use with a binary phenotype. Equally important, if not more so, is the ability to test the enri...


Bibliographic Details
Main Authors: Dinu, Irina; Wang, Xiaoming; Kelemen, Linda E; Vatanpour, Shabnam; Pyne, Saumyadipta
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3717275/
https://www.ncbi.nlm.nih.gov/pubmed/23815123
http://dx.doi.org/10.1186/1471-2105-14-212
_version_ 1782277679226028032
author Dinu, Irina
Wang, Xiaoming
Kelemen, Linda E
Vatanpour, Shabnam
Pyne, Saumyadipta
author_sort Dinu, Irina
collection PubMed
description BACKGROUND: Gene set analysis (GSA) methods test the association of sets of genes with a phenotype in gene expression microarray studies. Many GSA methods have been proposed, especially methods for use with a binary phenotype. Equally important, if not more so, is the ability to test the enrichment of a gene signature or pathway against the continuous phenotypes that are routinely observed in, for example, clinicopathological measurements. It is not always easy or meaningful to dichotomize continuous phenotypes into two classes, and attempting to do so may lead to inaccurate classification of samples, which would affect the downstream enrichment analysis. In the present study, we have built on recent efforts to incorporate correlation structure within gene sets and pathways into the GSA test statistic. New GSA methods that incorporate a covariance matrix estimator for a continuous phenotype may offer an effective way to handle continuous phenotypes directly, without artificial discrete classification, and thus to increase the power of the test while ensuring computational efficiency and rigor. RESULTS: We have designed a new method by extending the GSA approach called Linear Combination Test (LCT) from a binary to a continuous phenotype. Simulation studies and a real microarray dataset were used to compare the proposed LCT for a continuous phenotype, a modification of LCT (referred to as LCT(2)), and two publicly available GSA methods for continuous phenotypes. CONCLUSIONS: We found that the LCT methods performed better than the other two GSA methods; however, this finding should be understood in the context of the specific simulation studies and the real microarray dataset that were used to compare the methods. Free R code to perform LCT for binary and continuous phenotypes is available at http://www.ualberta.ca/~yyasui/homepage.html. The R code to perform LCT for a continuous phenotype is available as Additional file 1.
format Online
Article
Text
id pubmed-3717275
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3717275 2013-07-22 Linear combination test for gene set analysis of a continuous phenotype Dinu, Irina Wang, Xiaoming Kelemen, Linda E Vatanpour, Shabnam Pyne, Saumyadipta BMC Bioinformatics Methodology Article BioMed Central 2013-07-01 /pmc/articles/PMC3717275/ /pubmed/23815123 http://dx.doi.org/10.1186/1471-2105-14-212 Text en Copyright © 2013 Dinu et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Linear combination test for gene set analysis of a continuous phenotype
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3717275/
https://www.ncbi.nlm.nih.gov/pubmed/23815123
http://dx.doi.org/10.1186/1471-2105-14-212