Optimal breeding-value prediction using a sparse selection index

Genomic prediction uses DNA sequences and phenotypes to predict genetic values. In homogeneous populations, theory indicates that the accuracy of genomic prediction increases with sample size. However, differences in allele frequencies and linkage disequilibrium patterns can lead to heterogeneity in...

Bibliographic Details
Main authors:	Lopez-Cruz, Marco; de los Campos, Gustavo
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Investigation
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8128408/ https://www.ncbi.nlm.nih.gov/pubmed/33748861 http://dx.doi.org/10.1093/genetics/iyab030

_version_	1783694108406054912
author	Lopez-Cruz, Marco; de los Campos, Gustavo
author_facet	Lopez-Cruz, Marco; de los Campos, Gustavo
author_sort	Lopez-Cruz, Marco
collection	PubMed
description	Genomic prediction uses DNA sequences and phenotypes to predict genetic values. In homogeneous populations, theory indicates that the accuracy of genomic prediction increases with sample size. However, differences in allele frequencies and linkage disequilibrium patterns can lead to heterogeneity in SNP effects. In this context, calibrating genomic predictions using a large, potentially heterogeneous, training data set may not lead to optimal prediction accuracy. Some studies tried to address this sample size/homogeneity trade-off using training set optimization algorithms; however, this approach assumes that a single training data set is optimum for all individuals in the prediction set. Here, we propose an approach that identifies, for each individual in the prediction set, a subset from the training data (i.e., a set of support points) from which predictions are derived. The methodology that we propose is a sparse selection index (SSI) that integrates selection index methodology with sparsity-inducing techniques commonly used for high-dimensional regression. The sparsity of the resulting index is controlled by a regularization parameter (λ); the G-Best Linear Unbiased Predictor (G-BLUP) (the prediction method most commonly used in plant and animal breeding) appears as a special case which happens when λ = 0. In this study, we present the methodology and demonstrate (using two wheat data sets with phenotypes collected in 10 different environments) that the SSI can achieve significant (anywhere between 5 and 10%) gains in prediction accuracy relative to the G-BLUP.
format	Online Article Text
id	pubmed-8128408
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-8128408 2021-05-21 Optimal breeding-value prediction using a sparse selection index Lopez-Cruz, Marco; de los Campos, Gustavo Genetics Investigation Genomic prediction uses DNA sequences and phenotypes to predict genetic values. In homogeneous populations, theory indicates that the accuracy of genomic prediction increases with sample size. However, differences in allele frequencies and linkage disequilibrium patterns can lead to heterogeneity in SNP effects. In this context, calibrating genomic predictions using a large, potentially heterogeneous, training data set may not lead to optimal prediction accuracy. Some studies tried to address this sample size/homogeneity trade-off using training set optimization algorithms; however, this approach assumes that a single training data set is optimum for all individuals in the prediction set. Here, we propose an approach that identifies, for each individual in the prediction set, a subset from the training data (i.e., a set of support points) from which predictions are derived. The methodology that we propose is a sparse selection index (SSI) that integrates selection index methodology with sparsity-inducing techniques commonly used for high-dimensional regression. The sparsity of the resulting index is controlled by a regularization parameter (λ); the G-Best Linear Unbiased Predictor (G-BLUP) (the prediction method most commonly used in plant and animal breeding) appears as a special case which happens when λ = 0. In this study, we present the methodology and demonstrate (using two wheat data sets with phenotypes collected in 10 different environments) that the SSI can achieve significant (anywhere between 5 and 10%) gains in prediction accuracy relative to the G-BLUP. Oxford University Press 2021-03-20 /pmc/articles/PMC8128408/ /pubmed/33748861 http://dx.doi.org/10.1093/genetics/iyab030 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle	Investigation Lopez-Cruz, Marco de los Campos, Gustavo Optimal breeding-value prediction using a sparse selection index
title	Optimal breeding-value prediction using a sparse selection index
title_full	Optimal breeding-value prediction using a sparse selection index
title_fullStr	Optimal breeding-value prediction using a sparse selection index
title_full_unstemmed	Optimal breeding-value prediction using a sparse selection index
title_short	Optimal breeding-value prediction using a sparse selection index
title_sort	optimal breeding-value prediction using a sparse selection index
topic	Investigation
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8128408/ https://www.ncbi.nlm.nih.gov/pubmed/33748861 http://dx.doi.org/10.1093/genetics/iyab030
work_keys_str_mv	AT lopezcruzmarco optimalbreedingvaluepredictionusingasparseselectionindex AT deloscamposgustavo optimalbreedingvaluepredictionusingasparseselectionindex

Similar Items