A Stacking Ensemble Learning Framework for Genomic Prediction

Bibliographic Details
Main authors: Liang, Mang, Chang, Tianpeng, An, Bingxing, Duan, Xinghai, Du, Lili, Wang, Xiaoqiao, Miao, Jian, Xu, Lingyang, Gao, Xue, Zhang, Lupei, Li, Junya, Gao, Huijiang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7969712/
https://www.ncbi.nlm.nih.gov/pubmed/33747037
http://dx.doi.org/10.3389/fgene.2021.600040
author Liang, Mang
Chang, Tianpeng
An, Bingxing
Duan, Xinghai
Du, Lili
Wang, Xiaoqiao
Miao, Jian
Xu, Lingyang
Gao, Xue
Zhang, Lupei
Li, Junya
Gao, Huijiang
collection PubMed
description Machine learning (ML) is perhaps the most useful tool for the interpretation of large genomic datasets. However, the performance of a single machine learning method in genomic selection (GS) is currently unsatisfactory. To improve genomic predictions, we constructed a stacking ensemble learning framework (SELF) that integrates three machine learning methods to predict genomic estimated breeding values (GEBVs). The present study evaluated the prediction ability of SELF on three real datasets with different genetic architectures, comparing the prediction accuracy of SELF with that of its base learners, genomic best linear unbiased prediction (GBLUP) and BayesB. For each trait, SELF performed better than the base learners, which were support vector regression (SVR), kernel ridge regression (KRR) and elastic net (ENET). The prediction accuracy of SELF was, on average, 7.70% higher than that of GBLUP across the three datasets. Except for the milk fat percentage (MFP) trait of the German Holstein dairy cattle dataset, SELF was more robust than BayesB for all remaining traits. Therefore, we believe that SELF has the potential to be used to estimate GEBVs in other animals and plants.
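The stacking idea summarized in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the use of scikit-learn's `StackingRegressor`, the linear-regression meta-learner, the 5-fold internal cross-validation, every hyperparameter, the simulated genotype data, and the helper names `simulate_genotypes`, `build_stack` and `evaluate` are all assumptions made for this example. Only the choice of base learners (SVR, KRR, ENET) comes from the record. Accuracy is measured here as the Pearson correlation between predicted and simulated phenotypes, which is also an assumption.

```python
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR


def simulate_genotypes(n_animals=300, n_markers=500, n_qtl=20, seed=0):
    """Simulate 0/1/2 marker genotypes and an additive phenotype (toy data)."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)
    effects = np.zeros(n_markers)
    qtl = rng.choice(n_markers, size=n_qtl, replace=False)
    effects[qtl] = rng.normal(size=n_qtl)  # only a few markers are causal
    y = X @ effects + rng.normal(scale=1.0, size=n_animals)
    return X, y


def build_stack():
    """Stack SVR, KRR and ENET under a linear meta-learner (assumed choice)."""
    base_learners = [
        ("svr", SVR(kernel="rbf", C=1.0)),
        ("krr", KernelRidge(kernel="rbf", alpha=1.0)),
        ("enet", ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000)),
    ]
    # cv=5: the meta-learner is trained on out-of-fold base predictions,
    # so it never sees predictions made on data a base learner was fit to.
    return StackingRegressor(
        estimators=base_learners,
        final_estimator=LinearRegression(),
        cv=5,
    )


def evaluate(seed=0):
    """Fit the stack on a training split; return Pearson r on the held-out split."""
    X, y = simulate_genotypes(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = build_stack().fit(X_train, y_train)
    y_hat = model.predict(X_test)
    return float(np.corrcoef(y_test, y_hat)[0, 1])


if __name__ == "__main__":
    print(f"held-out prediction accuracy (Pearson r): {evaluate():.3f}")
```

Training the meta-learner on out-of-fold predictions is what distinguishes stacking from simply averaging the base learners; with real data the simulated matrix would be replaced by a genotype matrix and adjusted phenotypes.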
format Online
Article
Text
id pubmed-7969712
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
journal Front Genet
date 2021-03-04
rights Copyright © 2021 Liang, Chang, An, Duan, Du, Wang, Miao, Xu, Gao, Zhang, Li and Gao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY), http://creativecommons.org/licenses/by/4.0/. The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Stacking Ensemble Learning Framework for Genomic Prediction
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7969712/
https://www.ncbi.nlm.nih.gov/pubmed/33747037
http://dx.doi.org/10.3389/fgene.2021.600040