
Predicted Residual Error Sum of Squares of Mixed Models: An Application for Genomic Prediction

Genomic prediction is a statistical method to predict phenotypes of polygenic traits using high-throughput genomic data. Most diseases and behaviors in humans and animals are polygenic traits. The majority of agronomic traits in crops are also polygenic. Accurate prediction of these traits can help...


Bibliographic Details
Main Author:	Xu, Shizhong
Format:	Online Article Text
Language:	English
Published:	Genetics Society of America 2017
Subjects:	Genomic Selection
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345720/ https://www.ncbi.nlm.nih.gov/pubmed/28108552 http://dx.doi.org/10.1534/g3.116.038059

author	Xu, Shizhong
collection	PubMed
description	Genomic prediction is a statistical method to predict phenotypes of polygenic traits using high-throughput genomic data. Most diseases and behaviors in humans and animals are polygenic traits. The majority of agronomic traits in crops are also polygenic. Accurate prediction of these traits can help medical professionals diagnose acute diseases and breeders to increase food products, and therefore significantly contribute to human health and global food security. The best linear unbiased prediction (BLUP) is an important tool to analyze high-throughput genomic data for prediction. However, to judge the efficacy of the BLUP model with a particular set of predictors for a given trait, one has to provide an unbiased mechanism to evaluate the predictability. Cross-validation (CV) is an essential tool to achieve this goal, where a sample is partitioned into K parts of roughly equal size, one part is predicted using parameters estimated from the remaining K – 1 parts, and eventually every part is predicted using a sample excluding that part. Such a CV is called the K-fold CV. Unfortunately, CV presents a substantial increase in computational burden. We developed an alternative method, the HAT method, to replace CV. The new method corrects the estimated residual errors from the whole sample analysis using the leverage values of a hat matrix of the random effects to achieve the predicted residual errors. Properties of the HAT method were investigated using seven agronomic and 1000 metabolomic traits of an inbred rice population. Results showed that the HAT method is a very good approximation of the CV method. The method was also applied to 10 traits in 1495 hybrid rice with 1.6 million SNPs, and to human height of 6161 subjects with roughly 0.5 million SNPs of the Framingham heart study data. Predictabilities of the HAT and CV methods were all similar. The HAT method allows us to easily evaluate the predictabilities of genomic prediction for large numbers of traits in very large populations.
format	Online Article Text
id	pubmed-5345720
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Genetics Society of America
record_format	MEDLINE/PubMed
spelling	pubmed-5345720 2017-03-21 Xu, Shizhong G3 (Bethesda) Genomic Selection Genetics Society of America 2017-01-19 /pmc/articles/PMC5345720/ /pubmed/28108552 http://dx.doi.org/10.1534/g3.116.038059 Text en Copyright © 2017 Xu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Predicted Residual Error Sum of Squares of Mixed Models: An Application for Genomic Prediction
topic	Genomic Selection
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345720/ https://www.ncbi.nlm.nih.gov/pubmed/28108552 http://dx.doi.org/10.1534/g3.116.038059
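
The description above says the HAT method corrects whole-sample residuals with the leverage values of a hat matrix instead of refitting the model for every fold. The sketch below illustrates that idea on a simplified case and is not the author's implementation. It assumes a ridge-type model with no fixed effects and a variance ratio `lam` held fixed rather than re-estimated, and it uses simulated toy data. All function and variable names (`ridge_hat`, `press_hat`, `press_loo`, `lam`, `Z`, `Y`) are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ridge_normal_matrix(z, lam):
    """Return Z'Z + lam*I (assumed model: ridge-type shrinkage, no fixed effects)."""
    a = matmul(transpose(z), z)
    for i in range(len(a)):
        a[i][i] += lam
    return a


def ridge_hat(z, lam):
    """Hat matrix H = Z (Z'Z + lam*I)^-1 Z', so that fitted values are H y."""
    return matmul(matmul(z, inverse(ridge_normal_matrix(z, lam))), transpose(z))


def press_hat(z, y, lam):
    """Leverage-corrected residuals e_i / (1 - h_ii) from a single whole-sample fit."""
    h = ridge_hat(z, lam)
    fitted = [sum(hij * yj for hij, yj in zip(row, y)) for row in h]
    return [(yi - fi) / (1.0 - h[i][i]) for i, (yi, fi) in enumerate(zip(y, fitted))]


def press_loo(z, y, lam):
    """Brute-force check: refit n times, each time leaving one record out."""
    out = []
    for i in range(len(y)):
        z_tr, y_tr = z[:i] + z[i + 1:], y[:i] + y[i + 1:]
        rhs = matmul(transpose(z_tr), [[v] for v in y_tr])
        coef = [row[0] for row in matmul(inverse(ridge_normal_matrix(z_tr, lam)), rhs)]
        out.append(y[i] - sum(c * x for c, x in zip(coef, z[i])))
    return out


# Simulated toy data (not from the article): 12 records, 5 marker-like covariates.
random.seed(1)
n, p, lam = 12, 5, 2.0
Z = [[random.choice([-1.0, 0.0, 1.0]) for _ in range(p)] for _ in range(n)]
Y = [0.5 * sum(row) + random.gauss(0.0, 1.0) for row in Z]

hat_res = press_hat(Z, Y, lam)
loo_res = press_loo(Z, Y, lam)
press = sum(r * r for r in hat_res)  # predicted residual error sum of squares
max_gap = max(abs(a - b) for a, b in zip(hat_res, loo_res))
print(press, max_gap)
```

In this simplified setting the leverage-corrected residuals agree with the leave-one-out refits up to rounding, because `lam` is fixed and each fold holds a single record. When variance parameters are re-estimated within each fold, or folds hold several records, the correction is only an approximation, which is how the description characterizes the HAT method relative to CV.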