
A Function Accounting for Training Set Size and Marker Density to Model the Average Accuracy of Genomic Prediction

Prediction of genomic breeding values is of major practical relevance in dairy cattle breeding. Deterministic equations have been suggested to predict the accuracy of genomic breeding values in a given design which are based on training set size, reliability of phenotypes, and the number of independ...

Bibliographic Details
Main authors: Erbe, Malena, Gredler, Birgit, Seefried, Franz Reinhold, Bapst, Beat, Simianer, Henner
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3855218/
https://www.ncbi.nlm.nih.gov/pubmed/24339895
http://dx.doi.org/10.1371/journal.pone.0081046
author Erbe, Malena
Gredler, Birgit
Seefried, Franz Reinhold
Bapst, Beat
Simianer, Henner
collection PubMed
description Prediction of genomic breeding values is of major practical relevance in dairy cattle breeding. Deterministic equations have been suggested to predict the accuracy of genomic breeding values in a given design which are based on training set size, reliability of phenotypes, and the number of independent chromosome segments ([Image: see text]). The aim of our study was to find a general deterministic equation for the average accuracy of genomic breeding values that also accounts for marker density and can be fitted empirically. Two data sets of 5,698 Holstein Friesian bulls genotyped with 50 K SNPs and 1,332 Brown Swiss bulls genotyped with 50 K SNPs and imputed to ∼600 K SNPs were available. Different k-fold (k = 2–10, 15, 20) cross-validation scenarios (50 replicates, random assignment) were performed using a genomic BLUP approach. A maximum likelihood approach was used to estimate the parameters of different prediction equations. The highest likelihood was obtained when using a modified form of the deterministic equation of Daetwyler et al. (2010), augmented by a weighting factor (w) based on the assumption that the maximum achievable accuracy is [Image: see text]. The proportion of genetic variance captured by the complete SNP sets ([Image: see text]) was 0.76 to 0.82 for Holstein Friesian and 0.72 to 0.75 for Brown Swiss. When modifying the number of SNPs, w was found to be proportional to the log of the marker density up to a limit which is population- and trait-specific and was found to be reached with ∼20,000 SNPs in the Brown Swiss population studied.
format Online
Article
Text
id pubmed-3855218
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3855218 2013-12-11 A Function Accounting for Training Set Size and Marker Density to Model the Average Accuracy of Genomic Prediction Erbe, Malena Gredler, Birgit Seefried, Franz Reinhold Bapst, Beat Simianer, Henner PLoS One Research Article
Public Library of Science 2013-12-05 /pmc/articles/PMC3855218/ /pubmed/24339895 http://dx.doi.org/10.1371/journal.pone.0081046 Text en © 2013 Erbe et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Function Accounting for Training Set Size and Marker Density to Model the Average Accuracy of Genomic Prediction
topic Research Article
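
The abstract describes a Daetwyler-type deterministic accuracy equation scaled by a weighting factor w and fitted to cross-validation results. A minimal Python sketch of that idea follows. It rests on several assumptions: the formulas in this record are replaced by image placeholders, so the form r = w * sqrt(N r² / (N r² + M_e)) is assumed rather than quoted; the function names are hypothetical; the numbers are made up for illustration and are not results from the study; and a simple least-squares grid search stands in for the maximum likelihood approach the abstract mentions.

```python
import math


def daetwyler_accuracy(n_train, reliability, m_e):
    """Daetwyler-type expected accuracy: sqrt(N r^2 / (N r^2 + M_e)).

    n_train     -- training set size N
    reliability -- reliability of the phenotypes (r^2)
    m_e         -- number of independent chromosome segments
    """
    signal = n_train * reliability
    return math.sqrt(signal / (signal + m_e))


def weighted_accuracy(n_train, reliability, m_e, w):
    """ASSUMED modified form: the base accuracy scaled by a weight w,
    so that accuracy approaches w (not 1) as N grows."""
    return w * daetwyler_accuracy(n_train, reliability, m_e)


def fit_w_and_me(sizes, observed, reliability, me_grid):
    """Least-squares stand-in for the paper's maximum likelihood fit.

    For each candidate M_e the best w has a closed form (regression
    through the origin); the pair with the smallest squared error wins.
    """
    best = None
    for m_e in me_grid:
        base = [daetwyler_accuracy(n, reliability, m_e) for n in sizes]
        w = sum(o * b for o, b in zip(observed, base)) / sum(b * b for b in base)
        sse = sum((o - w * b) ** 2 for o, b in zip(observed, base))
        if best is None or sse < best[0]:
            best = (sse, w, m_e)
    return best[1], best[2]


# Made-up illustration: generate noise-free "observed" accuracies from
# known parameters and check that the grid search recovers them.
sizes = [500, 1000, 2000, 4000, 5000]
true_w, true_me, rel = 0.9, 1000.0, 0.8
observed = [weighted_accuracy(n, rel, true_me, true_w) for n in sizes]
me_grid = [float(m) for m in range(200, 3001, 100)]
w_hat, me_hat = fit_w_and_me(sizes, observed, rel, me_grid)
print(w_hat, me_hat)
```

In practice the observed accuracies would come from the k-fold cross-validation runs, where each k implies a different training set size, and a likelihood-based fit would also model the sampling error of each accuracy estimate.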