
Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16

Recently, gene set analysis (GSA) has been extended from use on gene expression data to use on single-nucleotide polymorphism (SNP) data in genome-wide association studies. When GSA has been demonstrated on SNP data, two popular statistics from gene expression data analysis (gene set enrichment anal...


Bibliographic Details
Main authors: Tintle, Nathan L, Borchers, Bryce, Brown, Marshall, Bekmetjev, Airat
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796000/
https://www.ncbi.nlm.nih.gov/pubmed/20018093
_version_ 1782175489396310016
author Tintle, Nathan L
Borchers, Bryce
Brown, Marshall
Bekmetjev, Airat
collection PubMed
description Recently, gene set analysis (GSA) has been extended from use on gene expression data to use on single-nucleotide polymorphism (SNP) data in genome-wide association studies. When GSA has been demonstrated on SNP data, two popular statistics from gene expression data analysis (gene set enrichment analysis [GSEA] and Fisher's exact test [FET]) have been used. However, GSEA and FET have shown a lack of power and robustness in the analysis of gene expression data. The purpose of this work is to investigate whether the same issues are also true for the analysis of SNP data. Ultimately, we conclude that GSEA and FET are not optimal for the analysis of SNP data when compared with the SUMSTAT method. In analysis of real SNP data from the Framingham Heart Study, we find that SUMSTAT finds many more gene sets to be significant when compared with other methods. In an analysis of simulated data, SUMSTAT demonstrates high power and better control of the type I error rate. GSA is a promising approach to the analysis of SNP data in GWAS and use of the SUMSTAT statistic instead of GSEA or FET may increase power and robustness.
format Text
id pubmed-2796000
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2796000 2009-12-18 BMC Proc Proceedings BioMed Central 2009-12-15 /pmc/articles/PMC2796000/ /pubmed/20018093 Text en Copyright ©2009 Tintle et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796000/
https://www.ncbi.nlm.nih.gov/pubmed/20018093