Deep Kernel and Deep Learning for Genome-Based Prediction of Single Traits in Multienvironment Breeding Trials
Deep learning (DL) is a promising method for genomic-enabled prediction. However, the implementation of DL is difficult because many hyperparameters (number of hidden layers, number of neurons, learning rate, number of epochs, batch size, etc.) need to be tuned. For this reason, deep kernel methods,...
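Main authors: | Crossa, José; Martini, Johannes W.R.; Gianola, Daniel; Pérez-Rodríguez, Paulino; Jarquin, Diego; Juliana, Philomin; Montesinos-López, Osval; Cuevas, Jaime |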
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2019 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6913188/ https://www.ncbi.nlm.nih.gov/pubmed/31921277 http://dx.doi.org/10.3389/fgene.2019.01168 |
_version_ | 1783479616024870912 |
---|---|
author | Crossa, José; Martini, Johannes W.R.; Gianola, Daniel; Pérez-Rodríguez, Paulino; Jarquin, Diego; Juliana, Philomin; Montesinos-López, Osval; Cuevas, Jaime |
author_sort | Crossa, José |
collection | PubMed |
description | Deep learning (DL) is a promising method for genomic-enabled prediction. However, implementing DL is difficult because many hyperparameters (number of hidden layers, number of neurons, learning rate, number of epochs, batch size, etc.) need to be tuned. For this reason, deep kernel methods, which only require defining the number of layers, may be an attractive alternative. Deep kernel methods emulate DL models with a large number of neurons, but are defined by relatively easily computed covariance matrices. In this research, we compared the genome-based prediction of DL to a deep kernel (arc-cosine kernel, AK), to the commonly used non-additive Gaussian kernel (GK), and to the conventional additive genomic best linear unbiased predictor (GBLUP/GB). We used two real wheat data sets to benchmark these methods. On average, AK and GK outperformed DL and GB. The gain in prediction performance of AK and GK over DL and GB was not large, but AK and GK have the advantage that only one parameter, the number of layers (AK) or the bandwidth parameter (GK), has to be tuned in each method. Furthermore, although AK and GK had similar performance, deep kernel AK is easier to implement than GK, since the parameter “number of layers” is more easily determined than the bandwidth parameter of GK. For the 2015–2016 data set, the difference in performance between AK and DL was larger, with AK predicting much better than DL. On this data set, optimizing the hyperparameters for DL was difficult, and the parameters finally used may have been suboptimal. Our results suggest that AK is a good alternative to DL, with the advantage that practically no tuning process is required. |
format | Online Article Text |
id | pubmed-6913188 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6913188 2020-01-09 Front Genet Genetics Frontiers Media S.A. 2019-12-09 /pmc/articles/PMC6913188/ /pubmed/31921277 http://dx.doi.org/10.3389/fgene.2019.01168 Text en Copyright © 2019 Crossa, Martini, Gianola, Pérez-Rodríguez, Jarquin, Juliana, Montesinos-López and Cuevas http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Deep Kernel and Deep Learning for Genome-Based Prediction of Single Traits in Multienvironment Breeding Trials |
topic | Genetics |
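The abstract describes the arc-cosine kernel (AK) as a covariance matrix that emulates a DL model and has a single tuning parameter, the number of layers. The following is a minimal sketch of how such a matrix can be computed, assuming the order-1 arc-cosine kernel in its usual recursive form; it is not taken from the article, and whether it matches the authors' implementation is an assumption. The function name `arc_cosine_kernel`, the argument `n_layers`, and the simulated marker matrix `X` are hypothetical names chosen for illustration.

```python
import numpy as np


def arc_cosine_kernel(X, n_layers=1):
    """Order-1 arc-cosine kernel built recursively over `n_layers` levels.

    X is an (n_lines, n_markers) matrix; every row is assumed to be nonzero,
    otherwise the normalisation below divides by zero.
    Returns an (n_lines, n_lines) symmetric kernel matrix.
    """
    if n_layers < 1:
        raise ValueError("n_layers must be at least 1")
    K = X @ X.T  # level 0: plain inner products between lines
    for _ in range(n_layers):
        d = np.sqrt(np.diag(K))            # norms implied by the current level
        norm = np.outer(d, d)
        # Clip guards against rounding slightly outside [-1, 1].
        cos_theta = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_theta)       # angle between each pair of lines
        K = norm * (np.sin(theta) + (np.pi - theta) * cos_theta) / np.pi
    return K


# Illustrative usage on simulated (not real) marker data.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))
K1 = arc_cosine_kernel(X, n_layers=1)
K3 = arc_cosine_kernel(X, n_layers=3)
```

In this sketch each additional layer reuses the previous level's matrix, so the diagonal stays equal to the squared row norms of `X`, and the number of layers is the only setting to choose. The resulting matrix could then stand in for the genomic relationship matrix in a kernel-based mixed model, which is again an assumption about usage rather than a description of the article's pipeline.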