
Using the Pareto principle in genome-wide breeding value estimation

Genome-wide breeding value (GWEBV) estimation methods can be classified based on the prior distribution assumptions of marker effects. Genome-wide BLUP methods assume a normal prior distribution for all markers with a constant variance, and are computationally fast. In Bayesian methods, more flexibl...


Bibliographic Details
Main authors: Yu, Xijiang; Meuwissen, Theo HE
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3354342/
https://www.ncbi.nlm.nih.gov/pubmed/22044555
http://dx.doi.org/10.1186/1297-9686-43-35
author Yu, Xijiang
Meuwissen, Theo HE
collection PubMed
description Genome-wide breeding value (GWEBV) estimation methods can be classified by the prior distribution they assume for marker effects. Genome-wide BLUP methods assume a normal prior with a constant variance for all markers and are computationally fast. Bayesian methods apply more flexible priors for SNP effects, which allow a few very large effects while most are small or even zero, but these methods are often computationally demanding because they rely on Markov chain Monte Carlo (MCMC) sampling. In this study, we adopted the Pareto principle to weight the available marker loci, i.e., we consider that x% of the loci explain (100 - x)% of the total genetic variance. Under this principle, the variances of the prior distributions of the 'big' and 'small' SNP can also be defined: the relatively few SNP with large effects explain a large proportion of the genetic variance, whereas the majority of SNP have small effects and explain a minor proportion. We name this method MixP; its prior is a mixture of two normal distributions, one with a big variance and one with a small variance. Simulation results, using a real Norwegian Red cattle pedigree, show that MixP is at least as accurate as the other methods in all studied cases. The method also reduces the number of hyper-parameters of the prior distribution from 2 (proportion and variance of SNP with big effects) to 1 (proportion of SNP with big effects), assuming the overall genetic variance is known. The mixture-of-normals prior made it possible to solve the equations iteratively, which reduced the computational load by two orders of magnitude. In the era of marker densities reaching millions and of whole-genome sequence data, MixP provides a computationally feasible Bayesian method of analysis.
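The Pareto weighting described in the abstract can be illustrated with a short sketch. It is not the authors' implementation (the abstract says they solve the equations iteratively, and gives no formulas). It only works through the arithmetic implied by "x% of the loci explain (100 - x)% of the total genetic variance", under the simplifying assumption that each locus contributes its effect variance to the total (for example, standardized genotypes). The function names and the parameter `big_fraction` (x/100) are hypothetical.

```python
import math


def pareto_variances(total_genetic_var, n_loci, big_fraction):
    """Per-locus prior variances implied by the Pareto split.

    Assumption (not from the source): each locus contributes its
    effect variance to the total genetic variance.
    """
    if not 0.0 < big_fraction < 1.0:
        raise ValueError("big_fraction must lie strictly between 0 and 1")
    n_big = big_fraction * n_loci        # expected number of 'big' SNP
    n_small = n_loci - n_big             # expected number of 'small' SNP
    # The 'big' SNP share (1 - big_fraction) of the variance.
    var_big = (1.0 - big_fraction) * total_genetic_var / n_big
    # The 'small' SNP share the remaining big_fraction of the variance.
    var_small = big_fraction * total_genetic_var / n_small
    return var_big, var_small


def normal_pdf(x, var):
    """Density of a zero-mean normal distribution with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def prob_big(effect, var_big, var_small, big_fraction):
    """Probability, under the two-normal mixture prior alone, that an
    effect of this size comes from the 'big' component."""
    big = big_fraction * normal_pdf(effect, var_big)
    small = (1.0 - big_fraction) * normal_pdf(effect, var_small)
    return big / (big + small)


if __name__ == "__main__":
    # Hypothetical numbers: 1000 loci, 20% 'big', total variance 1.
    vb, vs = pareto_variances(1.0, 1000, 0.2)
    print(vb, vs)
    print(prob_big(0.0, vb, vs, 0.2), prob_big(0.2, vb, vs, 0.2))
```

With the overall genetic variance treated as known, `big_fraction` is the only quantity left to choose, which matches the abstract's statement that the prior has a single hyper-parameter.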
format Online
Article
Text
id pubmed-3354342
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3354342 2012-05-18 Using the Pareto principle in genome-wide breeding value estimation Yu, Xijiang Meuwissen, Theo HE Genet Sel Evol Research BioMed Central 2011-11-01 /pmc/articles/PMC3354342/ /pubmed/22044555 http://dx.doi.org/10.1186/1297-9686-43-35 Text en Copyright ©2011 Yu and Meuwissen; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Using the Pareto principle in genome-wide breeding value estimation
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3354342/
https://www.ncbi.nlm.nih.gov/pubmed/22044555
http://dx.doi.org/10.1186/1297-9686-43-35