An efficient unified model for genome-wide association studies and genomic selection

BACKGROUND: A quantitative trait is controlled both by major variants with large genetic effects and by minor variants with small effects. Genome-wide association studies (GWAS) are an efficient approach to identify quantitative trait loci (QTL), and genomic selection (GS) with high-density single n...

Bibliographic Details
Main authors:	Li, Hengde, Su, Guosheng, Jiang, Li, Bao, Zhenmin
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5569572/ https://www.ncbi.nlm.nih.gov/pubmed/28836943 http://dx.doi.org/10.1186/s12711-017-0338-x

_version_	1783259019803099136
author	Li, Hengde Su, Guosheng Jiang, Li Bao, Zhenmin
author_facet	Li, Hengde Su, Guosheng Jiang, Li Bao, Zhenmin
author_sort	Li, Hengde
collection	PubMed
description	BACKGROUND: A quantitative trait is controlled both by major variants with large genetic effects and by minor variants with small effects. Genome-wide association studies (GWAS) are an efficient approach to identify quantitative trait loci (QTL), and genomic selection (GS) with high-density single nucleotide polymorphisms (SNPs) can achieve higher accuracy of estimated breeding values than conventional best linear unbiased prediction (BLUP). GWAS and GS address different aspects of quantitative traits, but, as statistical models, they are quite similar in their description of the genetic mechanisms that underlie quantitative traits. METHODS: Here, we propose a stepwise linear regression mixed model (StepLMM) to unify GWAS and GS in a single statistical model. First, the variance components of the genomic-BLUP (GBLUP) model are estimated. Then, in the SNP selection step, the linear mixed model (LMM) for GWAS is equivalently transformed into a simple linear regression to improve computation speed, and the most significant SNP is selected and included in the evaluation model. In the SNP dropping step, the SNPs in the evaluation model are tested according to the standard errors of their estimated effects. If non-significant SNPs are present, the least significant one is dropped from the model and the variance components are re-estimated. We used the extended Bayesian information criterion (eBIC) to choose among models: the model with the smallest eBIC is the final one and includes only significant SNPs. RESULTS: We simulated scenarios with 100 QTL and different heritabilities. StepLMM estimated heritability accurately and mapped QTL precisely. Genomic prediction accuracy was much higher with StepLMM than with GBLUP. The comparison of StepLMM with other GWAS and GS methods based on a dataset from the 16th QTLMAS Workshop showed that StepLMM had medium mapping power, the lowest rate of false positives for QTL mapping, and the highest accuracy for genomic prediction. CONCLUSIONS: StepLMM is a combination of GWAS and GBLUP. GWAS and GBLUP benefit each other in a single statistical model: GWAS improves genomic prediction accuracy, while GBLUP increases mapping precision and decreases the rate of false positives of GWAS. StepLMM performs well in both GWAS and GS and is feasible for agricultural breeding programs and human genetic studies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12711-017-0338-x) contains supplementary material, which is available to authorized users.
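The select/drop/eBIC loop described in METHODS can be illustrated with a minimal sketch. This is not the authors' StepLMM: it assumes ordinary least squares in place of the mixed model (no GBLUP variance components are estimated or re-estimated), it stops at the first step where eBIC fails to improve, and the names `ebic`, `fit`, `stepwise`, `t_drop`, `gamma` and `max_snps` are hypothetical. The eBIC form used here is the common Gaussian one, n·log(RSS/n) + k·log(n) + 2γ·log C(p, k).

```python
import math
import numpy as np

def ebic(rss, n, k, p, gamma=1.0):
    """Extended BIC for a Gaussian linear model using k of p candidate SNPs."""
    log_binom = math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)
    return n * math.log(rss / n) + k * math.log(n) + 2.0 * gamma * log_binom

def fit(y, X, cols):
    """OLS with intercept; returns the RSS and |t| statistics of the SNP effects."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    rss = float(resid @ resid)
    sigma2 = rss / (n - A.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return rss, np.abs(beta[1:] / se[1:])

def stepwise(y, X, t_drop=2.0, gamma=1.0, max_snps=20):
    """Forward selection with a dropping step, scored by eBIC (simplified)."""
    n, p = X.shape
    selected = []
    rss, _ = fit(y, X, selected)
    best_score, best_snps = ebic(rss, n, 0, p, gamma), []
    while len(selected) < max_snps:
        # Selection step: add the SNP that reduces the RSS the most,
        # which for a single added column is the most significant one.
        candidates = [j for j in range(p) if j not in selected]
        rss_by_snp = {j: fit(y, X, selected + [j])[0] for j in candidates}
        selected.append(min(rss_by_snp, key=rss_by_snp.get))
        # Dropping step: remove the least significant SNP while any |t| < t_drop.
        rss, t = fit(y, X, selected)
        while len(selected) > 1 and t.min() < t_drop:
            selected.pop(int(t.argmin()))
            rss, t = fit(y, X, selected)
        score = ebic(rss, n, len(selected), p, gamma)
        if score >= best_score:
            break  # simplification: stop at the first non-improving step
        best_score, best_snps = score, list(selected)
    return best_score, best_snps
```

Called as `stepwise(y, X)` with a phenotype vector `y` of length n and an n × p genotype matrix `X` coded 0/1/2, the function returns the smallest eBIC found and the column indices of the retained SNPs. A fuller implementation would fit the polygenic random effect at each step, as the abstract describes.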
format	Online Article Text
id	pubmed-5569572
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5569572 2017-08-29 An efficient unified model for genome-wide association studies and genomic selection Li, Hengde Su, Guosheng Jiang, Li Bao, Zhenmin Genet Sel Evol Research Article BioMed Central 2017-08-24 /pmc/articles/PMC5569572/ /pubmed/28836943 http://dx.doi.org/10.1186/s12711-017-0338-x Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Article Li, Hengde Su, Guosheng Jiang, Li Bao, Zhenmin An efficient unified model for genome-wide association studies and genomic selection
title	An efficient unified model for genome-wide association studies and genomic selection
title_sort	efficient unified model for genome-wide association studies and genomic selection
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5569572/ https://www.ncbi.nlm.nih.gov/pubmed/28836943 http://dx.doi.org/10.1186/s12711-017-0338-x