
Improving Genomic Prediction with Machine Learning Incorporating TPE for Hyperparameters Optimization

SIMPLE SUMMARY: Machine learning has become a crucial tool for genomic prediction. However, the complicated process of tuning hyperparameters has greatly hindered its application in actual breeding programs, especially for people without experience in tuning hyperparameters. In this study, we appli...


Bibliographic Details
Main Authors: Liang, Mang; An, Bingxing; Li, Keanning; Du, Lili; Deng, Tianyu; Cao, Sheng; Du, Yueying; Xu, Lingyang; Gao, Xue; Zhang, Lupei; Li, Junya; Gao, Huijiang
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9688023/
https://www.ncbi.nlm.nih.gov/pubmed/36421361
http://dx.doi.org/10.3390/biology11111647
author Liang, Mang
An, Bingxing
Li, Keanning
Du, Lili
Deng, Tianyu
Cao, Sheng
Du, Yueying
Xu, Lingyang
Gao, Xue
Zhang, Lupei
Li, Junya
Gao, Huijiang
author_sort Liang, Mang
collection PubMed
description SIMPLE SUMMARY: Machine learning has become a crucial tool for genomic prediction. However, the complicated process of tuning hyperparameters has greatly hindered its application in actual breeding programs, especially for people without experience in tuning hyperparameters. In this study, we applied a tree-structured Parzen estimator (TPE) to tune the hyperparameters of machine learning methods. Overall, kernel ridge regression (KRR) combined with TPE achieved the highest prediction accuracy in the simulation and real datasets. ABSTRACT: Owing to its excellent prediction ability, machine learning has been considered the most powerful tool for analyzing high-throughput sequencing genome data. However, the sophisticated process of tuning hyperparameters greatly impedes the wider application of machine learning in animal and plant breeding programs. Therefore, we integrated an automatic hyperparameter tuning algorithm, the tree-structured Parzen estimator (TPE), with machine learning to simplify its use for genomic prediction (GP). In this study, we applied TPE to optimize the hyperparameters of kernel ridge regression (KRR) and support vector regression (SVR). To evaluate the performance of TPE, we compared the prediction accuracy of KRR-TPE and SVR-TPE with that of genomic best linear unbiased prediction (GBLUP) and of KRR-RS, KRR-Grid, SVR-RS, and SVR-Grid, which tune the hyperparameters of KRR and SVR by random search (RS) and grid search (Grid), in a simulation dataset and the real datasets. The results indicated that KRR-TPE achieved the highest prediction ability across all populations and was the most convenient to use. In particular, for the Chinese Simmental beef cattle and Loblolly pine populations, the prediction accuracy of KRR-TPE showed average improvements of 8.73% and 6.08%, respectively, compared with GBLUP.
Our study will greatly promote the application of machine learning in GP and further accelerate breeding progress.
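The TPE idea named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified one-dimensional variant written with the Python standard library only, using a fixed kernel bandwidth. The objective function is a hypothetical stand-in for a cross-validated prediction error as a function of a single hyperparameter (here called `log_lam`, imagined as the log10 of a KRR regularization strength). All names, ranges, and default values are assumptions chosen for illustration.

```python
import math
import random


def objective(log_lam):
    # Hypothetical stand-in for a cross-validated error of a model such as KRR.
    # The minimum is placed at log_lam = -1.5 purely for illustration.
    return (log_lam + 1.5) ** 2


def kde(x, points, bandwidth):
    # Parzen (Gaussian-kernel) density estimate at x built from `points`.
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm


def tpe_minimize(f, low, high, n_trials=60, n_startup=10,
                 gamma=0.25, n_candidates=24, seed=0):
    # Simplified TPE loop: returns (best_loss, best_x).
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # fixed bandwidth, an assumption of this sketch
    trials = []  # list of (loss, x) pairs
    for t in range(n_trials):
        if t < n_startup:
            # Warm-up phase: plain random search.
            x = rng.uniform(low, high)
        else:
            # Split past trials into a "good" group (lowest losses) and the rest.
            ranked = sorted(trials)
            n_good = max(1, int(gamma * len(ranked)))
            good = [xi for _, xi in ranked[:n_good]]
            bad = [xi for _, xi in ranked[n_good:]]
            # Draw candidates from the density of good points, l(x).
            candidates = [
                min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                for _ in range(n_candidates)
            ]
            # Keep the candidate with the largest ratio l(x) / g(x).
            x = max(
                candidates,
                key=lambda c: kde(c, good, bandwidth) / (kde(c, bad, bandwidth) + 1e-12),
            )
        trials.append((f(x), x))
    return min(trials)


if __name__ == "__main__":
    best_loss, best_x = tpe_minimize(objective, low=-5.0, high=2.0)
    print(f"best log_lam = {best_x:.3f}, loss = {best_loss:.4f}")
```

Unlike grid search or random search, each new trial here is proposed from a model of the earlier trials, which is the property the description credits for making tuning more convenient. In practice one would likely rely on an established library rather than this sketch.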
format Online
Article
Text
id pubmed-9688023
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9688023 2022-11-25 Improving Genomic Prediction with Machine Learning Incorporating TPE for Hyperparameters Optimization Liang, Mang; An, Bingxing; Li, Keanning; Du, Lili; Deng, Tianyu; Cao, Sheng; Du, Yueying; Xu, Lingyang; Gao, Xue; Zhang, Lupei; Li, Junya; Gao, Huijiang Biology (Basel) Article MDPI 2022-11-11 /pmc/articles/PMC9688023/ /pubmed/36421361 http://dx.doi.org/10.3390/biology11111647 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Improving Genomic Prediction with Machine Learning Incorporating TPE for Hyperparameters Optimization
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9688023/
https://www.ncbi.nlm.nih.gov/pubmed/36421361
http://dx.doi.org/10.3390/biology11111647