Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations
KEY MESSAGE: We propose a criterion to predict genomic selection efficiency for structured populations. This criterion is useful to define optimal calibration set and to estimate prediction reliability for multiparental populations. ABSTRACT: Genomic selection refers to the use of genotypic informat...
Main authors: | Rincent, R., Charcosset, A., Moreau, L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Berlin Heidelberg 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5641287/ https://www.ncbi.nlm.nih.gov/pubmed/28795202 http://dx.doi.org/10.1007/s00122-017-2956-7 |
_version_ | 1783271191929159680 |
---|---|
author | Rincent, R. Charcosset, A. Moreau, L. |
author_facet | Rincent, R. Charcosset, A. Moreau, L. |
author_sort | Rincent, R. |
collection | PubMed |
description | KEY MESSAGE: We propose a criterion to predict genomic selection efficiency for structured populations. This criterion is useful to define optimal calibration set and to estimate prediction reliability for multiparental populations. ABSTRACT: Genomic selection refers to the use of genotypic information for predicting the performance of selection candidates. It has been shown that prediction accuracy depends on various parameters including the composition of the calibration set (CS). Assessing the level of accuracy of a given prediction scenario is of highest importance because it can be used to optimize CS sampling before collecting phenotypes, and once the breeding values are predicted it informs the breeders about the reliability of these predictions. Different criteria were proposed to optimize CS sampling in highly diverse panels, which can be useful to screen collections of genotypes. But plant breeders often work on structured material such as biparental or multiparental populations, for which these criteria are less adapted. We derived from the generalized coefficient of determination (CD) theory different criteria to optimize CS sampling and to assess the reliability associated to predictions in structured populations. These criteria were evaluated on two nested association mapping (NAM) populations and two highly diverse panels of maize. They were efficient to sample optimized CS in most situations. They could also estimate at least partly the reliability associated to predictions between NAM families, but they could not estimate differences in the reliability associated to the predictions of NAM families using the highly diverse panels as calibration sets. We illustrated that the CD criteria could be adapted to various prediction scenarios including inter and intra-family predictions, resulting in higher prediction accuracies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00122-017-2956-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5641287 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-5641287 2017-10-26 Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations Rincent, R. Charcosset, A. Moreau, L. Theor Appl Genet Original Article KEY MESSAGE: We propose a criterion to predict genomic selection efficiency for structured populations. This criterion is useful to define optimal calibration set and to estimate prediction reliability for multiparental populations. ABSTRACT: Genomic selection refers to the use of genotypic information for predicting the performance of selection candidates. It has been shown that prediction accuracy depends on various parameters including the composition of the calibration set (CS). Assessing the level of accuracy of a given prediction scenario is of highest importance because it can be used to optimize CS sampling before collecting phenotypes, and once the breeding values are predicted it informs the breeders about the reliability of these predictions. Different criteria were proposed to optimize CS sampling in highly diverse panels, which can be useful to screen collections of genotypes. But plant breeders often work on structured material such as biparental or multiparental populations, for which these criteria are less adapted. We derived from the generalized coefficient of determination (CD) theory different criteria to optimize CS sampling and to assess the reliability associated to predictions in structured populations. These criteria were evaluated on two nested association mapping (NAM) populations and two highly diverse panels of maize. They were efficient to sample optimized CS in most situations. They could also estimate at least partly the reliability associated to predictions between NAM families, but they could not estimate differences in the reliability associated to the predictions of NAM families using the highly diverse panels as calibration sets. We illustrated that the CD criteria could be adapted to various prediction scenarios including inter and intra-family predictions, resulting in higher prediction accuracies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00122-017-2956-7) contains supplementary material, which is available to authorized users. Springer Berlin Heidelberg 2017-08-09 2017 /pmc/articles/PMC5641287/ /pubmed/28795202 http://dx.doi.org/10.1007/s00122-017-2956-7 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. |
spellingShingle | Original Article Rincent, R. Charcosset, A. Moreau, L. Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations |
title | Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5641287/ https://www.ncbi.nlm.nih.gov/pubmed/28795202 http://dx.doi.org/10.1007/s00122-017-2956-7 |