
Using machine learning to improve the accuracy of genomic prediction of reproduction traits in pigs

BACKGROUND: Recently, machine learning (ML) has become attractive in genomic prediction, but its superiority in genomic prediction over conventional (ss) GBLUP methods and the choice of optimal ML methods need to be investigated. RESULTS: In this study, 2566 Chinese Yorkshire pigs with reproduction...


Bibliographic Details
Main authors: Wang, Xue, Shi, Shaolei, Wang, Guijiang, Luo, Wenxue, Wei, Xia, Qiu, Ao, Luo, Fei, Ding, Xiangdong
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9112588/
https://www.ncbi.nlm.nih.gov/pubmed/35578371
http://dx.doi.org/10.1186/s40104-022-00708-0
author Wang, Xue
Shi, Shaolei
Wang, Guijiang
Luo, Wenxue
Wei, Xia
Qiu, Ao
Luo, Fei
Ding, Xiangdong
collection PubMed
description BACKGROUND: Recently, machine learning (ML) has become attractive in genomic prediction, but its superiority over conventional (ss) GBLUP methods and the choice of optimal ML methods need to be investigated.
RESULTS: In this study, 2566 Chinese Yorkshire pigs with reproduction trait records were genotyped with the GenoBaits Porcine SNP 50K and PorcineSNP50 panels. Four ML methods, namely support vector regression (SVR), kernel ridge regression (KRR), random forest (RF) and Adaboost.R2, were implemented. The utility of ML methods in genomic prediction was explored through 20 replicates of fivefold cross-validation (CV) and one prediction for younger individuals. In CV, ML methods significantly outperformed genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP) and the Bayesian method BayesHE, improving genomic prediction accuracy over these three methods by 19.3%, 15.0% and 20.8%, respectively. In addition, ML methods yielded smaller mean squared error (MSE) and mean absolute error (MAE) in all scenarios. ssGBLUP improved accuracy by 3.8% on average compared with GBLUP, and the accuracy of BayesHE was close to that of GBLUP. In genomic prediction of younger individuals, RF and Adaboost.R2_KRR performed better than GBLUP and BayesHE, while ssGBLUP performed comparably with RF. For total number of piglets born, ssGBLUP yielded slightly higher accuracy and lower MSE than Adaboost.R2_KRR, whereas for number of piglets born alive, Adaboost.R2_KRR performed significantly better than ssGBLUP. Among ML methods, Adaboost.R2_KRR consistently performed well in our study. Our findings also demonstrated that optimal hyperparameters are useful for ML methods: after tuning hyperparameters, the average improvement over default hyperparameters was 14.3% in CV and 21.8% in predicting younger individuals.
CONCLUSION: Our findings demonstrated that ML methods had better overall prediction performance than conventional genomic selection methods, and could be new options for genomic prediction. Among ML methods, Adaboost.R2_KRR consistently performed well in our study, and tuning hyperparameters is necessary for ML methods. The optimal hyperparameters depend on the characteristics of the traits, the datasets, etc.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40104-022-00708-0.
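The evaluation scheme named in the description (20 replicates of fivefold cross-validation, scored by accuracy, MSE and MAE) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and is not taken from the article: accuracy is taken to be the Pearson correlation between observed and predicted values, the percentage gains are taken to be relative improvements, and the function names and the one-feature least-squares model are hypothetical stand-ins, not the authors' pipeline or any of the ML methods they used.

```python
import random
import statistics


def repeated_kfold_indices(n, k=5, replicates=20, seed=0):
    """Yield (replicate, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(replicates):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random partition for every replicate
        folds = [order[i::k] for i in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield rep, f, train_idx, test_idx


def pearson(a, b):
    """Pearson correlation (assumed here to be the 'accuracy' measure)."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def evaluate(y_true, y_pred):
    """Return accuracy, mean squared error and mean absolute error."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    return {
        "accuracy": pearson(y_true, y_pred),
        "mse": statistics.fmean([e * e for e in errors]),
        "mae": statistics.fmean([abs(e) for e in errors]),
    }


def cross_validate(x, y, fit_predict, k=5, replicates=20, seed=0):
    """Average each metric over all replicates and folds."""
    scores = []
    for _, _, train, test in repeated_kfold_indices(len(y), k, replicates, seed):
        y_pred = fit_predict(
            [x[i] for i in train], [y[i] for i in train], [x[i] for i in test]
        )
        scores.append(evaluate([y[i] for i in test], y_pred))
    return {m: statistics.fmean([s[m] for s in scores]) for m in scores[0]}


def simple_linear_fit_predict(x_train, y_train, x_test):
    """Hypothetical toy model: one-feature least squares, not an ML method."""
    mx, my = statistics.fmean(x_train), statistics.fmean(y_train)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x_train, y_train))
    sxx = sum((a - mx) ** 2 for a in x_train)
    slope = sxy / sxx
    return [my + slope * (v - mx) for v in x_test]


def relative_improvement(acc_new, acc_ref):
    """Percentage gain of acc_new over acc_ref (assumed definition)."""
    return 100.0 * (acc_new - acc_ref) / acc_ref


if __name__ == "__main__":
    rng = random.Random(42)
    x = [rng.gauss(0, 1) for _ in range(200)]        # synthetic predictor
    y = [2 * v + rng.gauss(0, 0.5) for v in x]       # synthetic phenotype
    print(cross_validate(x, y, simple_linear_fit_predict))
```

Any model can be compared under the same splits by passing a different `fit_predict` callable with the same three arguments; reusing the seed keeps the partitions identical across models, so differences in the averaged metrics reflect the models rather than the folds.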
format Online
Article
Text
id pubmed-9112588
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9112588 2022-05-18 Using machine learning to improve the accuracy of genomic prediction of reproduction traits in pigs J Anim Sci Biotechnol Research BioMed Central 2022-05-17 /pmc/articles/PMC9112588/ /pubmed/35578371 http://dx.doi.org/10.1186/s40104-022-00708-0 Text en
© The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Using machine learning to improve the accuracy of genomic prediction of reproduction traits in pigs
topic Research