
Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations

KEY MESSAGE: We propose a criterion to predict genomic selection efficiency for structured populations. This criterion is useful to define optimal calibration set and to estimate prediction reliability for multiparental populations. ABSTRACT: Genomic selection refers to the use of genotypic informat...


Bibliographic Details
Main authors:	Rincent, R., Charcosset, A., Moreau, L.
Format:	Online Article Text
Language:	English
Published:	Springer Berlin Heidelberg 2017
Subjects:	Original Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5641287/ https://www.ncbi.nlm.nih.gov/pubmed/28795202 http://dx.doi.org/10.1007/s00122-017-2956-7

_version_	1783271191929159680
author	Rincent, R. Charcosset, A. Moreau, L.
collection	PubMed
description	KEY MESSAGE: We propose a criterion to predict genomic selection efficiency for structured populations. This criterion is useful to define an optimal calibration set and to estimate prediction reliability for multiparental populations. ABSTRACT: Genomic selection refers to the use of genotypic information for predicting the performance of selection candidates. It has been shown that prediction accuracy depends on various parameters, including the composition of the calibration set (CS). Assessing the level of accuracy of a given prediction scenario is of the highest importance because it can be used to optimize CS sampling before collecting phenotypes, and once the breeding values are predicted it informs breeders about the reliability of these predictions. Different criteria were proposed to optimize CS sampling in highly diverse panels, which can be useful to screen collections of genotypes. But plant breeders often work on structured material such as biparental or multiparental populations, for which these criteria are less well suited. We derived from the generalized coefficient of determination (CD) theory different criteria to optimize CS sampling and to assess the reliability associated with predictions in structured populations. These criteria were evaluated on two nested association mapping (NAM) populations and two highly diverse panels of maize. They were effective at sampling optimized CS in most situations. They could also estimate at least partly the reliability associated with predictions between NAM families, but they could not estimate differences in the reliability associated with the predictions of NAM families using the highly diverse panels as calibration sets. We illustrated that the CD criteria could be adapted to various prediction scenarios, including inter- and intra-family predictions, resulting in higher prediction accuracies.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00122-017-2956-7) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5641287
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Springer Berlin Heidelberg
record_format	MEDLINE/PubMed
spelling	pubmed-5641287 2017-10-26 Rincent, R. Charcosset, A. Moreau, L. Theor Appl Genet Original Article Springer Berlin Heidelberg 2017-08-09 2017 /pmc/articles/PMC5641287/ /pubmed/28795202 http://dx.doi.org/10.1007/s00122-017-2956-7 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle	Original Article Rincent, R. Charcosset, A. Moreau, L. Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations
title	Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations
topic	Original Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5641287/ https://www.ncbi.nlm.nih.gov/pubmed/28795202 http://dx.doi.org/10.1007/s00122-017-2956-7


Similar Items