
Improving Genomic Prediction Using High-Dimensional Secondary Phenotypes

In the past decades, genomic prediction has had a large impact on plant breeding. Given the current advances of high-throughput phenotyping and sequencing technologies, it is increasingly common to observe a large number of traits, in addition to the target trait of interest. This raises the importa...


Bibliographic Details
Main authors:	Arouisse, Bader; Theeuwen, Tom P. J. M.; van Eeuwijk, Fred A.; Kruijer, Willem
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2021
Subjects:	Genetics
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8181460/ https://www.ncbi.nlm.nih.gov/pubmed/34108993 http://dx.doi.org/10.3389/fgene.2021.667358

author	Arouisse, Bader; Theeuwen, Tom P. J. M.; van Eeuwijk, Fred A.; Kruijer, Willem
author_sort	Arouisse, Bader
collection	PubMed
description	In the past decades, genomic prediction has had a large impact on plant breeding. Given the current advances of high-throughput phenotyping and sequencing technologies, it is increasingly common to observe a large number of traits, in addition to the target trait of interest. This raises the important question whether these additional or “secondary” traits can be used to improve genomic prediction for the target trait. With only a small number of secondary traits, this is known to be the case, given sufficiently high heritabilities and genetic correlations. Here we focus on the more challenging situation with a large number of secondary traits, which is increasingly common since the arrival of high-throughput phenotyping. In this case, secondary traits are usually incorporated through additional relatedness matrices. This approach is however infeasible when secondary traits are not measured on the test set, and cannot distinguish between genetic and non-genetic correlations. An alternative direction is to extend the classical selection indices using penalized regression. So far, penalized selection indices have not been applied in a genomic prediction setting, and require plot-level data in order to reliably estimate genetic correlations. Here we aim to overcome these limitations, using two novel approaches. Our first approach relies on a dimension reduction of the secondary traits, using either penalized regression or random forests (LS-BLUP/RF-BLUP). We then compute the bivariate GBLUP with the dimension reduction as secondary trait. For simulated data (with available plot-level data), we also use bivariate GBLUP with the penalized selection index as secondary trait (SI-BLUP). In our second approach (GM-BLUP), we follow existing multi-kernel methods but replace secondary traits by their genomic predictions, with the advantage that genomic prediction is also possible when secondary traits are only measured on the training set. 
For most of our simulated data, SI-BLUP was most accurate, often closely followed by RF-BLUP or LS-BLUP. In real datasets, involving metabolites in Arabidopsis and transcriptomics in maize, no method could substantially improve over univariate prediction when secondary traits were only available on the training set. LS-BLUP and RF-BLUP were most accurate when secondary traits were available also for the test set.
format	Online Article Text
id	pubmed-8181460
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-8181460 2021-06-08 Improving Genomic Prediction Using High-Dimensional Secondary Phenotypes Arouisse, Bader; Theeuwen, Tom P. J. M.; van Eeuwijk, Fred A.; Kruijer, Willem Front Genet Genetics Frontiers Media S.A. 2021-05-24 /pmc/articles/PMC8181460/ /pubmed/34108993 http://dx.doi.org/10.3389/fgene.2021.667358 Text en Copyright © 2021 Arouisse, Theeuwen, van Eeuwijk and Kruijer. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Improving Genomic Prediction Using High-Dimensional Secondary Phenotypes
topic	Genetics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8181460/ https://www.ncbi.nlm.nih.gov/pubmed/34108993 http://dx.doi.org/10.3389/fgene.2021.667358


Similar items