Iterative sure independence screening EM-Bayesian LASSO algorithm for multi-locus genome-wide association studies
Main authors: | , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5308866/ https://www.ncbi.nlm.nih.gov/pubmed/28141824 http://dx.doi.org/10.1371/journal.pcbi.1005357 |
Summary: | A genome-wide association study (GWAS) examines a large number of single nucleotide polymorphisms (SNPs) in a limited sample of hundreds of individuals, which poses a variable selection problem in a high-dimensional dataset. Although many single-locus GWAS approaches with polygenic background and population structure controls have been widely used, some significant loci fail to be detected. In this study, we used an iterative modified sure independence screening (ISIS) approach to reduce the number of SNPs to a moderate size. Expectation-Maximization (EM)-Bayesian least absolute shrinkage and selection operator (BLASSO) was then used to estimate all the selected SNP effects for true quantitative trait nucleotide (QTN) detection. This method is referred to as the ISIS EM-BLASSO algorithm. Monte Carlo simulation studies validated the new method, which had the highest empirical power in QTN detection, the highest accuracy in QTN effect estimation, and the shortest running time compared with efficient mixed-model association (EMMA), smoothly clipped absolute deviation (SCAD), fixed and random model circulating probability unification (FarmCPU), and multi-locus random-SNP-effect mixed linear model (mrMLM). To further demonstrate the new method, six flowering time traits in Arabidopsis thaliana were re-analyzed with four methods (the new method, EMMA, FarmCPU, and mrMLM). As a result, the new method identified most previously reported genes. Therefore, the new method is a good alternative for multi-locus GWAS. |
---|
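
The two-stage idea in the summary (screen SNPs down to a moderate set, then estimate the retained effects with a shrinkage method) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes plain marginal-correlation screening rather than the paper's modified screening, it swaps in an ordinary LASSO fitted by coordinate descent in place of EM-BLASSO, and it omits polygenic background and population structure control. All function names and the defaults for `keep`, `rounds`, and `lam` are hypothetical.

```python
import numpy as np

def marginal_screen(X, y, keep):
    """Return indices of the `keep` columns most correlated (in absolute value) with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    denom[denom == 0] = np.inf  # constant columns get correlation 0
    corr = np.abs(Xc.T @ yc) / denom
    return np.argsort(corr)[::-1][:keep]

def lasso_cd(X, y, lam, n_iter=200):
    """Ordinary LASSO by coordinate descent (assumed stand-in for EM-BLASSO).

    Minimises (1 / 2n) * ||y - X b||^2 + lam * ||b||_1 on centred data.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    resid = y - y.mean()
    beta = np.zeros(p)
    col_sq = (Xc ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            resid += Xc[:, j] * beta[j]          # remove SNP j from the fit
            rho = Xc[:, j] @ resid
            beta[j] = np.sign(rho) * max(abs(rho) - n * lam, 0.0) / col_sq[j]
            resid -= Xc[:, j] * beta[j]          # add the updated SNP j back
    return beta

def screen_then_shrink(X, y, keep=20, rounds=2, lam=0.1):
    """Iteratively screen on residuals, refit, and keep SNPs with nonzero effects."""
    selected = np.array([], dtype=int)
    beta = np.array([])
    resid = y - y.mean()
    for _ in range(rounds):
        remaining = np.setdiff1d(np.arange(X.shape[1]), selected)
        new = remaining[marginal_screen(X[:, remaining], resid, keep)]
        selected = np.concatenate([selected, new])
        Xs = X[:, selected]
        beta = lasso_cd(Xs, y, lam)
        resid = (y - y.mean()) - (Xs - Xs.mean(axis=0)) @ beta
        nonzero = beta != 0
        selected, beta = selected[nonzero], beta[nonzero]
    return {int(i): float(b) for i, b in zip(selected, beta)}
```

Calling `screen_then_shrink(X, y)` on an individuals-by-SNPs genotype matrix `X` (coded 0/1/2) and a phenotype vector `y` returns a mapping from retained SNP column index to estimated effect. Screening against the residuals in later rounds gives SNPs that were masked by stronger signals a chance to enter the model, which is the motivation for iterating rather than screening once.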