Computational strategies for alternative single-step Bayesian regression models with large numbers of genotyped and non-genotyped animals

BACKGROUND: Two types of models have been used for single-step genomic prediction and genome-wide association studies that include phenotypes from both genotyped animals and their non-genotyped relatives. The two types are breeding value models (BVM) that fit breeding values explicitly and marker ef...

Bibliographic Details
Main Authors: Fernando, Rohan L., Cheng, Hao, Golden, Bruce L., Garrick, Dorian J.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5144523/
https://www.ncbi.nlm.nih.gov/pubmed/27931187
http://dx.doi.org/10.1186/s12711-016-0273-2
author Fernando, Rohan L.
Cheng, Hao
Golden, Bruce L.
Garrick, Dorian J.
collection PubMed
description BACKGROUND: Two types of models have been used for single-step genomic prediction and genome-wide association studies that include phenotypes from both genotyped animals and their non-genotyped relatives. The two types are breeding value models (BVM), which fit breeding values explicitly, and marker effects models (MEM), which express the breeding values in terms of the effects of observed or imputed genotypes. MEM can accommodate a wider class of analyses, including variable selection or mixture model analyses. The order of the equations that need to be solved and the inverses required in their construction vary widely, and thus the computational effort required depends on the size of the pedigree, the number of genotyped animals and the number of loci. THEORY: We present computational strategies to avoid storing large, dense blocks of the mixed model equations (MME) that involve imputed genotypes. Furthermore, we present a hybrid model that fits a MEM for animals with observed genotypes and a BVM for those without genotypes. The hybrid model is computationally attractive for pedigree files containing millions of animals with a large proportion of those being genotyped. APPLICATION: We demonstrate the practicality of both the original MEM and the hybrid model using real data with 6,179,960 animals in the pedigree, 4,934,101 phenotypes and 31,453 animals genotyped at 40,214 informative loci. Completing a single-trait analysis on a desktop computer with four graphics cards required about 3 h using the hybrid model to obtain both preconditioned conjugate gradient solutions and 42,000 Markov chain Monte Carlo (MCMC) samples of breeding values, which allowed making inferences from posterior means, variances and covariances. The MCMC sampling required one quarter of the effort when the hybrid model was used compared to the published MEM. CONCLUSIONS: We present a hybrid model that fits a MEM for animals with genotypes and a BVM for those without genotypes. Its practicality and considerable reduction in computing effort were demonstrated. This model can readily be extended to accommodate multiple traits, multiple breeds, maternal effects, and additional random effects such as polygenic residual effects.
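The abstract mentions preconditioned conjugate gradient solutions and avoiding storage of large, dense blocks of the equations. As a rough illustration only, the sketch below solves a toy ridge-type system (M'M + lambda*I) x = M'y with a diagonally (Jacobi) preconditioned conjugate gradient solver that never forms M'M. The matrix `M`, the vector `y`, the value of `lam`, and all function names are assumptions made up for this example; this is not the authors' implementation and it does not reproduce their model.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def pcg(matvec: Callable[[Vector], Vector], b: Vector, diag: Vector,
        tol: float = 1e-12, max_iter: int = 1000) -> Vector:
    """Jacobi-preconditioned conjugate gradient for a symmetric
    positive definite system, using only matrix-vector products."""
    x = [0.0] * len(b)
    r = list(b)                                  # residual b - A x with x = 0
    z = [ri / di for ri, di in zip(r, diag)]     # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Toy data (hypothetical): 3 animals, 2 markers.
M = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
lam = 1.0                                        # assumed shrinkage parameter


def coef_matvec(v: Vector) -> Vector:
    """Compute (M'M + lam*I) v as M'(M v) + lam*v, without storing M'M."""
    mv = [dot(row, v) for row in M]
    mtmv = [sum(M[i][j] * mv[i] for i in range(len(M))) for j in range(len(v))]
    return [a + lam * vj for a, vj in zip(mtmv, v)]


rhs = [sum(M[i][j] * y[i] for i in range(len(M))) for j in range(2)]
diag = [sum(M[i][j] ** 2 for i in range(len(M))) + lam for j in range(2)]
solution = pcg(coef_matvec, rhs, diag)
print(solution)
```

The design point is that the solver only needs a function returning the product of the coefficient matrix with a vector, so the dense product M'M never has to be held in memory; the diagonal used for preconditioning can be accumulated column by column.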
format Online
Article
Text
id pubmed-5144523
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5144523 2016-12-15 Computational strategies for alternative single-step Bayesian regression models with large numbers of genotyped and non-genotyped animals Fernando, Rohan L. Cheng, Hao Golden, Bruce L. Garrick, Dorian J. Genet Sel Evol Research Article BioMed Central 2016-12-08 /pmc/articles/PMC5144523/ /pubmed/27931187 http://dx.doi.org/10.1186/s12711-016-0273-2 Text en © The Author(s) 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Computational strategies for alternative single-step Bayesian regression models with large numbers of genotyped and non-genotyped animals
topic Research Article