Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials
Modern whole-genome prediction (WGP) frameworks that focus on multi-environment trials (MET) integrate large-scale genomics, phenomics, and envirotyping data. However, the more complex the statistical model, the longer the computational processing times, which do not always result in accuracy gains....
Main authors: | Costa-Neto, Germano; Fritsche-Neto, Roberto; Crossa, José |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7852533/ https://www.ncbi.nlm.nih.gov/pubmed/32855544 http://dx.doi.org/10.1038/s41437-020-00353-1 |
_version_ | 1783645841516396544 |
---|---|
author | Costa-Neto, Germano Fritsche-Neto, Roberto Crossa, José |
author_facet | Costa-Neto, Germano Fritsche-Neto, Roberto Crossa, José |
author_sort | Costa-Neto, Germano |
collection | PubMed |
description | Modern whole-genome prediction (WGP) frameworks that focus on multi-environment trials (MET) integrate large-scale genomics, phenomics, and envirotyping data. However, the more complex the statistical model, the longer the computational processing times, which do not always result in accuracy gains. We investigated the use of new kernel methods and modeling structures involving genomic and nongenomic sources of variation in two MET maize data sets. Five WGP models were considered, advancing in complexity from a main-effect additive model (A) to more complex structures, including dominance deviations (D), genotype × environment interaction (AE and DE), and the reaction-norm model using environmental covariables (W) and their interaction with A and D (AW + DW). A combination of those models built with three different kernel methods, Gaussian kernel (GK), Deep kernel (DK), and the benchmark genomic best linear unbiased predictor (GBLUP/GB), was tested under three prediction scenarios: newly developed hybrids (CV1), sparse MET conditions (CV2), and new environments (CV0). GK and DK outperformed GB in prediction accuracy and reduction of computation time (up to ~20%) under all model–kernel scenarios. GK was more efficient in capturing the variation due to A + AE and D + DE effects and translated it into accuracy gains (up to ~85% compared with GB). DK provided more consistent predictions, even for more complex structures such as W + AW + DW. Our results suggest that DK and GK are more efficient in translating model complexity into accuracy, and more suitable for including dominance and reaction-norm effects in a biologically accurate and faster way. |
format | Online Article Text |
id | pubmed-7852533 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-7852533 2021-02-08 Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials Costa-Neto, Germano Fritsche-Neto, Roberto Crossa, José Heredity (Edinb) Article 
Springer International Publishing 2020-08-27 2021-01 /pmc/articles/PMC7852533/ /pubmed/32855544 http://dx.doi.org/10.1038/s41437-020-00353-1 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Costa-Neto, Germano Fritsche-Neto, Roberto Crossa, José Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials |
title | Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7852533/ https://www.ncbi.nlm.nih.gov/pubmed/32855544 http://dx.doi.org/10.1038/s41437-020-00353-1 |
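
The description contrasts a linear GBLUP kernel (GB) with a nonlinear Gaussian kernel (GK). As a rough illustration of that difference, the sketch below builds both relationship matrices from a small simulated marker matrix. Everything in it is an assumption made for illustration and is not taken from the record: the function names `gblup_kernel` and `gaussian_kernel`, the 0/1/2 marker coding, the scaling of the linear kernel by the number of markers, the bandwidth `h = 1`, and the use of the median squared distance as the scaling factor `q`. It is not the authors' implementation.

```python
import numpy as np


def gblup_kernel(M):
    """Linear (GBLUP-style) kernel: K = Z Z' / p, with Z the column-centered markers.

    Dividing by the number of markers p is an illustrative scaling choice.
    """
    Z = M - M.mean(axis=0)
    return Z @ Z.T / M.shape[1]


def gaussian_kernel(M, h=1.0):
    """Gaussian kernel: K_ij = exp(-h * d_ij^2 / q).

    d_ij^2 is the squared Euclidean distance between genotypes i and j;
    q is taken here as the median off-diagonal squared distance (an assumption).
    """
    sq = (M ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (M @ M.T)
    d2 = np.maximum(d2, 0.0)      # guard against tiny negative round-off
    np.fill_diagonal(d2, 0.0)     # distance of a genotype to itself is zero
    q = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-h * d2 / q)


# Simulated data: 6 genotypes x 50 markers coded 0/1/2 (hypothetical).
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(6, 50)).astype(float)

K_gb = gblup_kernel(M)
K_gk = gaussian_kernel(M)
```

Either matrix could then serve as the covariance structure of the additive effect in a mixed or Bayesian model. The linear kernel can take negative values off the diagonal, while the Gaussian kernel is bounded in (0, 1] and decays with genetic distance.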