A variance component method for integrated pathway analysis of gene expression data

BACKGROUND: The application of pathway and gene-set based analyses to high-throughput data is increasingly common and represents an effort to understand underlying biology where single-gene or single-marker analyses have failed. Many such analyses rely on the a priori identification of genes associa...


Bibliographic Details
Main Authors: Quillen, Ellen E., Blangero, John, Almasy, Laura
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Proceedings
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5133490/
https://www.ncbi.nlm.nih.gov/pubmed/27980659
http://dx.doi.org/10.1186/s12919-016-0053-6
author Quillen, Ellen E.
Blangero, John
Almasy, Laura
collection PubMed
description BACKGROUND: The application of pathway and gene-set based analyses to high-throughput data is increasingly common and represents an effort to understand underlying biology where single-gene or single-marker analyses have failed. Many such analyses rely on the a priori identification of genes associated with the trait of interest. In contrast, this variance-component–based approach creates a similarity matrix of individuals based on the expression of genes in each pathway. METHODS: We compared 16 methods of calculating similarity for positive control matrices based on probes for the genes used to model the simulated Genetic Analysis Workshop phenotypes. RESULTS: A simple correlation matrix outperforms the other methods by identifying pathways associated with the simulated phenotypes at nearly twice the rate expected based on the associations of the component transcripts and an approximate false-positive rate of 0.05. CONCLUSIONS: This method has a number of additional advantages compared to single-transcript and pathway overrepresentation analyses, including the ability to estimate the proportion of variation explained by each pathway and the logistical advantage of only calculating the distance matrices once for each messenger RNA data set regardless of the number of phenotypes. Additionally, it offers a significant reduction in the multiple testing burden over individual consideration of each probe.
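As a rough illustration of the "simple correlation matrix" mentioned in the description, the following is a minimal sketch of how a similarity matrix between individuals could be computed from the expression of the probes in one pathway. The function name, the array layout (individuals in rows, probes in columns), and the per-probe standardization step are illustrative assumptions, not details taken from the article, and the variance component model that would consume this matrix is not shown.

```python
import numpy as np


def pathway_similarity(expr):
    """Correlation-based similarity between individuals for one pathway.

    expr: array of shape (n_individuals, n_probes) holding the expression
    of the probes assigned to the pathway (hypothetical layout).
    Returns an (n_individuals, n_individuals) symmetric matrix.
    """
    expr = np.asarray(expr, dtype=float)
    # Assumption: standardize each probe across individuals so that highly
    # expressed or highly variable probes do not dominate the similarity.
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    # np.corrcoef treats each row as a variable, so this correlates
    # individuals with one another across the pathway's probes.
    return np.corrcoef(z)


# Toy data only: 6 individuals, 20 probes, drawn at random.
rng = np.random.default_rng(0)
expr = rng.normal(size=(6, 20))
K = pathway_similarity(expr)
```

Because such a matrix depends only on the expression data, it would need to be computed once per pathway and could then be reused for every phenotype, which matches the logistical advantage the description points out.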
format Online
Article
Text
id pubmed-5133490
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5133490 2016-12-15 A variance component method for integrated pathway analysis of gene expression data Quillen, Ellen E. Blangero, John Almasy, Laura BMC Proc Proceedings BioMed Central 2016-10-18 /pmc/articles/PMC5133490/ /pubmed/27980659 http://dx.doi.org/10.1186/s12919-016-0053-6 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A variance component method for integrated pathway analysis of gene expression data
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5133490/
https://www.ncbi.nlm.nih.gov/pubmed/27980659
http://dx.doi.org/10.1186/s12919-016-0053-6