A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction

Genomic prediction is revolutionizing plant breeding since candidate genotypes can be selected without the need to measure their trait in the field. When a reference population contains both phenotypic and genotypic information, it is trained by a statistical machine learning method that is subseque...

Bibliographic Details
Main Authors: Montesinos-López, Osval A., Carter, Arron H., Bernal-Sandoval, David Alejandro, Cano-Paez, Bernabe, Montesinos-López, Abelardo, Crossa, José
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9778581/
https://www.ncbi.nlm.nih.gov/pubmed/36553547
http://dx.doi.org/10.3390/genes13122282
_version_ 1784856397995835392
author Montesinos-López, Osval A.
Carter, Arron H.
Bernal-Sandoval, David Alejandro
Cano-Paez, Bernabe
Montesinos-López, Abelardo
Crossa, José
author_facet Montesinos-López, Osval A.
Carter, Arron H.
Bernal-Sandoval, David Alejandro
Cano-Paez, Bernabe
Montesinos-López, Abelardo
Crossa, José
author_sort Montesinos-López, Osval A.
collection PubMed
description Genomic prediction is revolutionizing plant breeding since candidate genotypes can be selected without the need to measure their trait in the field. When a reference population contains both phenotypic and genotypic information, it is trained by a statistical machine learning method that is subsequently used for making predictions of breeding or phenotypic values of candidate genotypes that were only genotyped. Nevertheless, the successful implementation of the genomic selection (GS) methodology depends on many factors. One key factor is the type of statistical machine learning method used since some are unable to capture nonlinear patterns available in the data. While kernel methods are powerful statistical machine learning algorithms that capture complex nonlinear patterns in the data, their successful implementation strongly depends on the careful tuning process of the involved hyperparameters. As such, in this paper we compare three methods of tuning (manual tuning, grid search, and Bayesian optimization) for the Gaussian kernel under a Bayesian best linear unbiased predictor model. We used six real datasets of wheat (Triticum aestivum L.) to compare the three strategies of tuning. We found that if we want to obtain the major benefits of using Gaussian kernels, it is very important to perform a careful tuning process. The best prediction performance was observed when the tuning process was performed with grid search and Bayesian optimization. However, we did not observe relevant differences between the grid search and Bayesian optimization approach. The observed gains in terms of prediction performance were between 2.1% and 27.8% across the six datasets under study.
format Online
Article
Text
id pubmed-9778581
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-97785812022-12-23 A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction Montesinos-López, Osval A. Carter, Arron H. Bernal-Sandoval, David Alejandro Cano-Paez, Bernabe Montesinos-López, Abelardo Crossa, José Genes (Basel) Article Genomic prediction is revolutionizing plant breeding since candidate genotypes can be selected without the need to measure their trait in the field. When a reference population contains both phenotypic and genotypic information, it is trained by a statistical machine learning method that is subsequently used for making predictions of breeding or phenotypic values of candidate genotypes that were only genotyped. Nevertheless, the successful implementation of the genomic selection (GS) methodology depends on many factors. One key factor is the type of statistical machine learning method used since some are unable to capture nonlinear patterns available in the data. While kernel methods are powerful statistical machine learning algorithms that capture complex nonlinear patterns in the data, their successful implementation strongly depends on the careful tuning process of the involved hyperparameters. As such, in this paper we compare three methods of tuning (manual tuning, grid search, and Bayesian optimization) for the Gaussian kernel under a Bayesian best linear unbiased predictor model. We used six real datasets of wheat (Triticum aestivum L.) to compare the three strategies of tuning. We found that if we want to obtain the major benefits of using Gaussian kernels, it is very important to perform a careful tuning process. The best prediction performance was observed when the tuning process was performed with grid search and Bayesian optimization. However, we did not observe relevant differences between the grid search and Bayesian optimization approach. 
The observed gains in terms of prediction performance were between 2.1% and 27.8% across the six datasets under study. MDPI 2022-12-03 /pmc/articles/PMC9778581/ /pubmed/36553547 http://dx.doi.org/10.3390/genes13122282 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Montesinos-López, Osval A.
Carter, Arron H.
Bernal-Sandoval, David Alejandro
Cano-Paez, Bernabe
Montesinos-López, Abelardo
Crossa, José
A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction
title A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction
title_full A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction
title_fullStr A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction
title_full_unstemmed A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction
title_short A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction
title_sort comparison between three tuning strategies for gaussian kernels in the context of univariate genomic prediction
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9778581/
https://www.ncbi.nlm.nih.gov/pubmed/36553547
http://dx.doi.org/10.3390/genes13122282