Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations

KEY MESSAGE: We propose a criterion to predict genomic selection efficiency for structured populations. This criterion is useful to define optimal calibration set and to estimate prediction reliability for multiparental populations. ABSTRACT: Genomic selection refers to the use of genotypic informat...

Bibliographic Details
Main Authors: Rincent, R., Charcosset, A., Moreau, L.
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5641287/
https://www.ncbi.nlm.nih.gov/pubmed/28795202
http://dx.doi.org/10.1007/s00122-017-2956-7
_version_ 1783271191929159680
author Rincent, R.
Charcosset, A.
Moreau, L.
collection PubMed
description KEY MESSAGE: We propose a criterion to predict genomic selection efficiency for structured populations. This criterion is useful to define optimal calibration set and to estimate prediction reliability for multiparental populations. ABSTRACT: Genomic selection refers to the use of genotypic information for predicting the performance of selection candidates. It has been shown that prediction accuracy depends on various parameters including the composition of the calibration set (CS). Assessing the level of accuracy of a given prediction scenario is of highest importance because it can be used to optimize CS sampling before collecting phenotypes, and once the breeding values are predicted it informs the breeders about the reliability of these predictions. Different criteria were proposed to optimize CS sampling in highly diverse panels, which can be useful to screen collections of genotypes. But plant breeders often work on structured material such as biparental or multiparental populations, for which these criteria are less adapted. We derived from the generalized coefficient of determination (CD) theory different criteria to optimize CS sampling and to assess the reliability associated to predictions in structured populations. These criteria were evaluated on two nested association mapping (NAM) populations and two highly diverse panels of maize. They were efficient to sample optimized CS in most situations. They could also estimate at least partly the reliability associated to predictions between NAM families, but they could not estimate differences in the reliability associated to the predictions of NAM families using the highly diverse panels as calibration sets. We illustrated that the CD criteria could be adapted to various prediction scenarios including inter and intra-family predictions, resulting in higher prediction accuracies. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00122-017-2956-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5641287
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-5641287 2017-10-26 Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations Rincent, R. Charcosset, A. Moreau, L. Theor Appl Genet Original Article Springer Berlin Heidelberg 2017-08-09 2017 /pmc/articles/PMC5641287/ /pubmed/28795202 http://dx.doi.org/10.1007/s00122-017-2956-7 Text en
© The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations
topic Original Article