
Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials

Modern whole-genome prediction (WGP) frameworks that focus on multi-environment trials (MET) integrate large-scale genomics, phenomics, and envirotyping data. However, the more complex the statistical model, the longer the computational processing times, which do not always result in accuracy gains....


Bibliographic Details
Main authors: Costa-Neto, Germano; Fritsche-Neto, Roberto; Crossa, José
Format: Online Article Text
Language: English
Published: Springer International Publishing 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7852533/
https://www.ncbi.nlm.nih.gov/pubmed/32855544
http://dx.doi.org/10.1038/s41437-020-00353-1
author Costa-Neto, Germano
Fritsche-Neto, Roberto
Crossa, José
collection PubMed
description Modern whole-genome prediction (WGP) frameworks that focus on multi-environment trials (MET) integrate large-scale genomics, phenomics, and envirotyping data. However, the more complex the statistical model, the longer the computational processing times, which do not always result in accuracy gains. We investigated the use of new kernel methods and modeling structures involving genomics and nongenomic sources of variation in two MET maize data sets. Five WGP models were considered, advancing in complexity from a main-effect additive model (A) to more complex structures, including dominance deviations (D), genotype × environment interaction (AE and DE), and the reaction-norm model using environmental covariables (W) and their interaction with A and D (AW + DW). A combination of those models built with three different kernel methods, Gaussian kernel (GK), Deep kernel (DK), and the benchmark genomic best linear-unbiased predictor (GBLUP/GB), was tested under three prediction scenarios: newly developed hybrids (CV1), sparse MET conditions (CV2), and new environments (CV0). GK and DK outperformed GB in prediction accuracy and reduction of computation time (~up to 20%) under all model–kernel scenarios. GK was more efficient in capturing the variation due to A + AE and D + DE effects and translated it into accuracy gains (~up to 85% compared with GB). DK provided more consistent predictions, even for more complex structures such as W + AW + DW. Our results suggest that DK and GK are more efficient in translating model complexity into accuracy, and more suitable for including dominance and reaction-norm effects in a biologically accurate and faster way.
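The abstract contrasts a linear GBLUP kernel (GB) with a nonlinear Gaussian kernel (GK) but does not state their formulas. The following is a minimal sketch of how two such relationship kernels are commonly built from a marker matrix. The function names, the scaling by the number of markers, the median-distance normalization, and the bandwidth `h` are illustrative assumptions, not the authors' implementation, and the simulated marker matrix is toy data only.

```python
import numpy as np


def gblup_kernel(X):
    """Linear (GBLUP-style) kernel from an n x p marker matrix.

    Assumption: markers are column-centered and the cross-product is
    scaled by the number of markers p.
    """
    Z = X - X.mean(axis=0)           # center each marker column
    return Z @ Z.T / X.shape[1]      # n x n linear relationship matrix


def gaussian_kernel(X, h=1.0):
    """Gaussian kernel exp(-h * d^2 / q) on centered markers.

    Assumption: q is the median of the off-diagonal squared Euclidean
    distances; h is a hypothetical bandwidth parameter.
    """
    Z = X - X.mean(axis=0)
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)   # squared distances
    d2 = np.maximum(d2, 0.0)                           # guard against rounding
    q = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-h * d2 / q)


# Toy data: 6 genotypes, 50 markers coded 0/1/2 (simulated, not from the study)
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(6, 50)).astype(float)

K_gb = gblup_kernel(X)
K_gk = gaussian_kernel(X, h=1.0)
```

Either matrix could then serve as the covariance structure for the additive effect in a mixed model. The Gaussian kernel differs in that similarity decays nonlinearly with marker distance, which is the property the abstract associates with the accuracy gains over GB.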
format Online
Article
Text
id pubmed-7852533
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-7852533 2021-02-08 Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials Costa-Neto, Germano Fritsche-Neto, Roberto Crossa, José Heredity (Edinb) Article
Springer International Publishing 2020-08-27 2021-01 /pmc/articles/PMC7852533/ /pubmed/32855544 http://dx.doi.org/10.1038/s41437-020-00353-1 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7852533/
https://www.ncbi.nlm.nih.gov/pubmed/32855544
http://dx.doi.org/10.1038/s41437-020-00353-1