
Benchmarking Parametric and Machine Learning Models for Genomic Prediction of Complex Traits


Bibliographic Details
Main authors:	Azodi, Christina B.; Bolger, Emily; McCarren, Andrew; Roantree, Mark; de los Campos, Gustavo; Shiu, Shin-Han
Format:	Online Article Text
Language:	English
Published:	Genetics Society of America 2019
Subjects:	Genomic Prediction
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6829122/ https://www.ncbi.nlm.nih.gov/pubmed/31533955 http://dx.doi.org/10.1534/g3.119.400498

collection	PubMed
description	The usefulness of genomic prediction in crop and livestock breeding programs has prompted efforts to develop new and improved genomic prediction algorithms, such as artificial neural networks and gradient tree boosting. However, the performance of these algorithms has not been compared systematically across a wide range of datasets and models. Using data for 18 traits across six plant species with different marker densities and training population sizes, we compared the performance of six linear and six non-linear algorithms. First, we found that hyperparameter selection was necessary for all non-linear algorithms and that feature selection prior to model training was critical for artificial neural networks when the markers greatly outnumbered the training lines. Across all species and trait combinations, no single algorithm performed best; however, predictions based on a combination of results from multiple algorithms (i.e., ensemble predictions) performed consistently well. While linear and non-linear algorithms performed best for a similar number of traits, the performance of non-linear algorithms varied more between traits. Although artificial neural networks did not perform best for any trait, we identified strategies (i.e., feature selection, seeded starting weights) that boosted their performance to near the level of other algorithms. Our results highlight the importance of algorithm selection for the prediction of trait values.
id	pubmed-6829122
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
journal	G3 (Bethesda)
published	2019-09-18
license	Copyright © 2019 Azodi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
