Analyzing Genome-Wide Association Studies with an FDR Controlling Modification of the Bayesian Information Criterion

The prevailing method of analyzing GWAS data is still to test each marker individually, although from a statistical point of view it is quite obvious that in case of complex traits such single marker tests are not ideal. Recently several model selection approaches for GWAS have been suggested, most...

Bibliographic Details
Main Authors: Dolejsi, Erich, Bodenstorfer, Bernhard, Frommlet, Florian
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4111553/
https://www.ncbi.nlm.nih.gov/pubmed/25061809
http://dx.doi.org/10.1371/journal.pone.0103322
author Dolejsi, Erich
Bodenstorfer, Bernhard
Frommlet, Florian
collection PubMed
description The prevailing method of analyzing GWAS data is still to test each marker individually, although from a statistical point of view it is quite obvious that in case of complex traits such single marker tests are not ideal. Recently several model selection approaches for GWAS have been suggested, most of them based on LASSO-type procedures. Here we will discuss an alternative model selection approach which is based on a modification of the Bayesian Information Criterion (mBIC2) which was previously shown to have certain asymptotic optimality properties in terms of minimizing the misclassification error. Heuristic search strategies are introduced which attempt to find the model which minimizes mBIC2, and which are efficient enough to allow the analysis of GWAS data. Our approach is implemented in a software package called MOSGWA. Its performance in case control GWAS is compared with the two algorithms HLASSO and d-GWASelect, as well as with single marker tests, where we performed a simulation study based on real SNP data from the POPRES sample. Our results show that MOSGWA performs slightly better than HLASSO, where specifically for more complex models MOSGWA is more powerful with only a slight increase in Type I error. On the other hand according to our simulations GWASelect does not at all control the type I error when used to automatically determine the number of important SNPs. We also reanalyze the GWAS data from the Wellcome Trust Case-Control Consortium and compare the findings of the different procedures, where MOSGWA detects for complex diseases a number of interesting SNPs which are not found by other methods.
format Online
Article
Text
id pubmed-4111553
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4111553 2014-07-29 Analyzing Genome-Wide Association Studies with an FDR Controlling Modification of the Bayesian Information Criterion Dolejsi, Erich Bodenstorfer, Bernhard Frommlet, Florian PLoS One Research Article
Public Library of Science 2014-07-25 /pmc/articles/PMC4111553/ /pubmed/25061809 http://dx.doi.org/10.1371/journal.pone.0103322 Text en © 2014 Dolejsi et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Analyzing Genome-Wide Association Studies with an FDR Controlling Modification of the Bayesian Information Criterion
topic Research Article