Reduction in accuracy of genomic prediction for ordered categorical data compared to continuous observations

BACKGROUND: Accuracy of genomic prediction depends on number of records in the training population, heritability, effective population size, genetic architecture, and relatedness of training and validation populations. Many traits have ordered categories including reproductive performance and suscep...

Bibliographic Details
Main Authors:	Kizilkaya, Kadir; Fernando, Rohan L; Garrick, Dorian J
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2014
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4094927/ https://www.ncbi.nlm.nih.gov/pubmed/24912924 http://dx.doi.org/10.1186/1297-9686-46-37

author	Kizilkaya, Kadir Fernando, Rohan L Garrick, Dorian J
author_facet	Kizilkaya, Kadir Fernando, Rohan L Garrick, Dorian J
author_sort	Kizilkaya, Kadir
collection	PubMed
description	BACKGROUND: Accuracy of genomic prediction depends on number of records in the training population, heritability, effective population size, genetic architecture, and relatedness of training and validation populations. Many traits have ordered categories including reproductive performance and susceptibility or resistance to disease. Categorical scores are often recorded because they are easier to obtain than continuous observations. Bayesian linear regression has been extended to the threshold model for genomic prediction. The objective of this study was to quantify reductions in accuracy for ordinal categorical traits relative to continuous traits. METHODS: Efficiency of genomic prediction was evaluated for heritabilities of 0.10, 0.25 or 0.50. Phenotypes were simulated for 2250 purebred animals using 50 QTL selected from actual 50k SNP (single nucleotide polymorphism) genotypes giving a proportion of causal to total loci of 0.0001. A BayesCπ threshold model simultaneously fitted all 50k markers except those that represented QTL. Estimated SNP effects were utilized to predict genomic breeding values in purebred (n = 239) or multibreed (n = 924) validation populations. Correlations between true and predicted genomic merit in validation populations were used to assess predictive ability. RESULTS: Accuracies of genomic estimated breeding values ranged from 0.12 to 0.66 for purebred and from 0.04 to 0.53 for multibreed validation populations based on BayesCπ linear model analysis of the simulated underlying variable. Accuracies for ordinal categorical scores analyzed by the BayesCπ threshold model were 20% to 50% lower and ranged from 0.04 to 0.55 for purebred and from 0.01 to 0.44 for multibreed validation populations. Analysis of ordinal categorical scores using a linear model resulted in further reductions in accuracy. CONCLUSIONS: Threshold traits result in markedly lower accuracy than a linear model on the underlying variable. To achieve an accuracy equal to or greater than that for continuous phenotypes with a training population of 1000 animals, a 2.25-fold increase in training population size was required for categorical scores fitted with the threshold model. The threshold model resulted in higher accuracies than the linear model and its advantage was greatest when training populations were smallest.
format	Online Article Text
id	pubmed-4094927
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4094927 2014-07-23 Reduction in accuracy of genomic prediction for ordered categorical data compared to continuous observations Kizilkaya, Kadir Fernando, Rohan L Garrick, Dorian J Genet Sel Evol Research BACKGROUND: Accuracy of genomic prediction depends on number of records in the training population, heritability, effective population size, genetic architecture, and relatedness of training and validation populations. Many traits have ordered categories including reproductive performance and susceptibility or resistance to disease. Categorical scores are often recorded because they are easier to obtain than continuous observations. Bayesian linear regression has been extended to the threshold model for genomic prediction. The objective of this study was to quantify reductions in accuracy for ordinal categorical traits relative to continuous traits. METHODS: Efficiency of genomic prediction was evaluated for heritabilities of 0.10, 0.25 or 0.50. Phenotypes were simulated for 2250 purebred animals using 50 QTL selected from actual 50k SNP (single nucleotide polymorphism) genotypes giving a proportion of causal to total loci of 0.0001. A BayesCπ threshold model simultaneously fitted all 50k markers except those that represented QTL. Estimated SNP effects were utilized to predict genomic breeding values in purebred (n = 239) or multibreed (n = 924) validation populations. Correlations between true and predicted genomic merit in validation populations were used to assess predictive ability. RESULTS: Accuracies of genomic estimated breeding values ranged from 0.12 to 0.66 for purebred and from 0.04 to 0.53 for multibreed validation populations based on BayesCπ linear model analysis of the simulated underlying variable. Accuracies for ordinal categorical scores analyzed by the BayesCπ threshold model were 20% to 50% lower and ranged from 0.04 to 0.55 for purebred and from 0.01 to 0.44 for multibreed validation populations. Analysis of ordinal categorical scores using a linear model resulted in further reductions in accuracy. CONCLUSIONS: Threshold traits result in markedly lower accuracy than a linear model on the underlying variable. To achieve an accuracy equal to or greater than that for continuous phenotypes with a training population of 1000 animals, a 2.25-fold increase in training population size was required for categorical scores fitted with the threshold model. The threshold model resulted in higher accuracies than the linear model and its advantage was greatest when training populations were smallest. BioMed Central 2014-06-09 /pmc/articles/PMC4094927/ /pubmed/24912924 http://dx.doi.org/10.1186/1297-9686-46-37 Text en Copyright © 2014 Kizilkaya et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
spellingShingle	Research Kizilkaya, Kadir Fernando, Rohan L Garrick, Dorian J Reduction in accuracy of genomic prediction for ordered categorical data compared to continuous observations
title	Reduction in accuracy of genomic prediction for ordered categorical data compared to continuous observations
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4094927/ https://www.ncbi.nlm.nih.gov/pubmed/24912924 http://dx.doi.org/10.1186/1297-9686-46-37

Similar Items