
Large-scale risk prediction applied to Genetic Analysis Workshop 17 mini-exome sequence data

We consider the application of Efron’s empirical Bayes classification method to risk prediction in a genome-wide association study using the Genetic Analysis Workshop 17 (GAW17) data. A major advantage of using this method is that the effect size distribution for the set of possible features is empi...

Bibliographic Details
Main authors: Li, Gengxin; Ferguson, John; Zheng, Wei; Lee, Joon Sang; Zhang, Xianghua; Li, Lun; Kang, Jia; Yan, Xiting; Zhao, Hongyu
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287883/
https://www.ncbi.nlm.nih.gov/pubmed/22373389
http://dx.doi.org/10.1186/1753-6561-5-S9-S46
author Li, Gengxin
Ferguson, John
Zheng, Wei
Lee, Joon Sang
Zhang, Xianghua
Li, Lun
Kang, Jia
Yan, Xiting
Zhao, Hongyu
collection PubMed
description We consider the application of Efron’s empirical Bayes classification method to risk prediction in a genome-wide association study using the Genetic Analysis Workshop 17 (GAW17) data. A major advantage of using this method is that the effect size distribution for the set of possible features is empirically estimated and that all subsequent parameter estimation and risk prediction are guided by this distribution. Here, we generalize Efron’s method to allow for some of the peculiarities of the GAW17 data. In particular, we introduce two ways to extend Efron’s model: a weighted empirical Bayes model and a joint covariance model that allows the model to properly incorporate the annotation information of single-nucleotide polymorphisms (SNPs). In the course of our analysis, we examine several aspects of the possible simulation model, including the identity of the most important genes, the differing effects of synonymous and nonsynonymous SNPs, and the relative roles of covariates and genes in conferring disease risk. Finally, we compare the three methods to each other and to other classifiers (random forest and neural network).
format Online
Article
Text
id pubmed-3287883
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3287883 2012-02-28 BMC Proc Proceedings BioMed Central 2011-11-29 /pmc/articles/PMC3287883/ /pubmed/22373389 http://dx.doi.org/10.1186/1753-6561-5-S9-S46 Text en Copyright ©2011 Li et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Large-scale risk prediction applied to Genetic Analysis Workshop 17 mini-exome sequence data
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287883/
https://www.ncbi.nlm.nih.gov/pubmed/22373389
http://dx.doi.org/10.1186/1753-6561-5-S9-S46