Deep Kernel and Deep Learning for Genome-Based Prediction of Single Traits in Multienvironment Breeding Trials

Deep learning (DL) is a promising method for genomic-enabled prediction. However, the implementation of DL is difficult because many hyperparameters (number of hidden layers, number of neurons, learning rate, number of epochs, batch size, etc.) need to be tuned. For this reason, deep kernel methods,...

Bibliographic Details
Main authors: Crossa, José, Martini, Johannes W.R., Gianola, Daniel, Pérez-Rodríguez, Paulino, Jarquin, Diego, Juliana, Philomin, Montesinos-López, Osval, Cuevas, Jaime
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6913188/
https://www.ncbi.nlm.nih.gov/pubmed/31921277
http://dx.doi.org/10.3389/fgene.2019.01168
author Crossa, José
Martini, Johannes W.R.
Gianola, Daniel
Pérez-Rodríguez, Paulino
Jarquin, Diego
Juliana, Philomin
Montesinos-López, Osval
Cuevas, Jaime
collection PubMed
description Deep learning (DL) is a promising method for genomic-enabled prediction. However, the implementation of DL is difficult because many hyperparameters (number of hidden layers, number of neurons, learning rate, number of epochs, batch size, etc.) need to be tuned. For this reason, deep kernel methods, which only require defining the number of layers, may be an attractive alternative. Deep kernel methods emulate DL models with a large number of neurons, but are defined by relatively easily computed covariance matrices. In this research, we compared the genome-based prediction of DL to a deep kernel (arc-cosine kernel, AK), to the commonly used non-additive Gaussian kernel (GK), as well as to the conventional additive genomic best linear unbiased predictor (GBLUP/GB). We used two real wheat data sets for benchmarking these methods. On average, AK and GK outperformed DL and GB. The gain in terms of prediction performance of AK and GK over DL and GB was not large, but AK and GK have the advantage that only one parameter, the number of layers (AK) or the bandwidth parameter (GK), has to be tuned in each method. Furthermore, although AK and GK had similar performance, deep kernel AK is easier to implement than GK, since the parameter “number of layers” is more easily determined than the bandwidth parameter of GK. Comparing AK and DL for the data set of year 2015–2016, the difference in performance of the two methods was bigger, with AK predicting much better than DL. On this data, the optimization of the hyperparameters for DL was difficult and the finally used parameters may have been suboptimal. Our results suggest that AK is a good alternative to DL with the advantage that practically no tuning process is required.
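The description above contrasts a deep kernel with a single tuning parameter (number of layers) against a Gaussian kernel with a bandwidth parameter. The following is a minimal sketch of both covariance matrices, assuming the order-1 arc-cosine kernel recursion as commonly formulated by Cho and Saul; it is not taken from the article and may differ from the authors' implementation. The function names, the `n_layers` and `h` arguments, and the unscaled form of the Gaussian kernel are hypothetical choices for illustration.

```python
import numpy as np

def arc_cosine_kernel(X, n_layers=1):
    """Order-1 arc-cosine kernel between the rows of X (assumed formulation).

    Assumes no row of X is all zeros; n_layers is the only tuning parameter.
    """
    K = X @ X.T  # base inner products between marker vectors
    for _ in range(n_layers):
        d = np.sqrt(np.diag(K))
        norm = np.outer(d, d)                      # ||x|| * ||y|| at this layer
        cos_t = np.clip(K / norm, -1.0, 1.0)       # guard against rounding
        theta = np.arccos(cos_t)
        # J(theta) = sin(theta) + (pi - theta) * cos(theta)
        K = norm * (np.sin(theta) + (np.pi - theta) * cos_t) / np.pi
    return K

def gaussian_kernel(X, h=1.0):
    """Generic Gaussian kernel exp(-h * squared distance); h is the bandwidth."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-h * d2)

# Hypothetical usage on simulated data (not the wheat data sets of the article)
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
K_ak = arc_cosine_kernel(X, n_layers=2)
K_gk = gaussian_kernel(X, h=0.01)
```

In this sketch the arc-cosine kernel needs only an integer layer count, whereas the Gaussian kernel needs a continuous bandwidth `h`, which is the practical difference the description points to.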
format Online
Article
Text
id pubmed-6913188
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6913188 2020-01-09 Front Genet Genetics Frontiers Media S.A. 2019-12-09 /pmc/articles/PMC6913188/ /pubmed/31921277 http://dx.doi.org/10.3389/fgene.2019.01168 Text en Copyright © 2019 Crossa, Martini, Gianola, Pérez-Rodríguez, Jarquin, Juliana, Montesinos-López and Cuevas http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Deep Kernel and Deep Learning for Genome-Based Prediction of Single Traits in Multienvironment Breeding Trials
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6913188/
https://www.ncbi.nlm.nih.gov/pubmed/31921277
http://dx.doi.org/10.3389/fgene.2019.01168