
Improvement of Predictive Ability by Uniform Coverage of the Target Genetic Space

Genome-enabled prediction provides breeders with the means to increase the number of genotypes that can be evaluated for selection. One of the major challenges in genome-enabled prediction is how to construct a training set of genotypes from a calibration set that represents the target population of...

Bibliographic Details
Main Authors:	Bustos-Korts, Daniela; Malosetti, Marcos; Chapman, Scott; Biddulph, Ben; van Eeuwijk, Fred
Format:	Online Article Text
Language:	English
Published:	Genetics Society of America 2016
Subjects:	Genomic Selection
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5100872/ https://www.ncbi.nlm.nih.gov/pubmed/27672112 http://dx.doi.org/10.1534/g3.116.035410

author	Bustos-Korts, Daniela; Malosetti, Marcos; Chapman, Scott; Biddulph, Ben; van Eeuwijk, Fred
collection	PubMed
description	Genome-enabled prediction provides breeders with the means to increase the number of genotypes that can be evaluated for selection. One of the major challenges in genome-enabled prediction is how to construct a training set of genotypes from a calibration set that represents the target population of genotypes, where the calibration set is composed of a training and a validation set. Random sampling of genotypes from the calibration set leads to low-quality coverage of the total genetic space by the training set when the calibration set contains population structure. As a consequence, predictive ability is affected negatively, because some parts of the genotypic diversity in the target population are under-represented in the training set, whereas other parts are over-represented. Therefore, we propose a training set construction method that uniformly samples the genetic space spanned by the target population of genotypes, thereby increasing predictive ability. To evaluate our method, we constructed training sets and identified the corresponding genomic prediction models for four genotype panels that differed in the amount of population structure they contained (maize Flint, maize Dent, wheat, and rice). Training sets were constructed using uniform sampling, stratified-uniform sampling, stratified sampling, and random sampling. We compared these methods with a method that maximizes the generalized coefficient of determination (CD). Several training set sizes were considered. We investigated four genomic prediction models: multi-locus QTL models, GBLUP models, combinations of QTL and GBLUP models, and Reproducing Kernel Hilbert Space (RKHS) models. For the maize and wheat panels, construction of the training set under uniform sampling led to a larger predictive ability than under stratified and random sampling. The results of our methods were similar to those of the CD method. For the rice panel, all training set construction methods led to similar predictive ability, a reflection of the very strong population structure in this panel.
format	Online Article Text
id	pubmed-5100872
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Genetics Society of America
record_format	MEDLINE/PubMed
spelling	pubmed-5100872 2016-11-09 G3 (Bethesda) Genetics Society of America 2016-09-22 /pmc/articles/PMC5100872/ /pubmed/27672112 http://dx.doi.org/10.1534/g3.116.035410 Text en Copyright © 2016 Bustos-Korts et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Improvement of Predictive Ability by Uniform Coverage of the Target Genetic Space
topic	Genomic Selection
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5100872/ https://www.ncbi.nlm.nih.gov/pubmed/27672112 http://dx.doi.org/10.1534/g3.116.035410
