Predicted Residual Error Sum of Squares of Mixed Models: An Application for Genomic Prediction
Genomic prediction is a statistical method to predict phenotypes of polygenic traits using high-throughput genomic data. Most diseases and behaviors in humans and animals are polygenic traits. The majority of agronomic traits in crops are also polygenic. Accurate prediction of these traits can help...
Main author: | Xu, Shizhong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Genetics Society of America, 2017 |
Subjects: | Genomic Selection |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345720/ https://www.ncbi.nlm.nih.gov/pubmed/28108552 http://dx.doi.org/10.1534/g3.116.038059 |
_version_ | 1782513771255693312 |
---|---|
author | Xu, Shizhong |
collection | PubMed |
description | Genomic prediction is a statistical method to predict phenotypes of polygenic traits using high-throughput genomic data. Most diseases and behaviors in humans and animals are polygenic traits. The majority of agronomic traits in crops are also polygenic. Accurate prediction of these traits can help medical professionals diagnose acute diseases and breeders to increase food products, and therefore significantly contribute to human health and global food security. The best linear unbiased prediction (BLUP) is an important tool to analyze high-throughput genomic data for prediction. However, to judge the efficacy of the BLUP model with a particular set of predictors for a given trait, one has to provide an unbiased mechanism to evaluate the predictability. Cross-validation (CV) is an essential tool to achieve this goal, where a sample is partitioned into K parts of roughly equal size, one part is predicted using parameters estimated from the remaining K – 1 parts, and eventually every part is predicted using a sample excluding that part. Such a CV is called the K-fold CV. Unfortunately, CV presents a substantial increase in computational burden. We developed an alternative method, the HAT method, to replace CV. The new method corrects the estimated residual errors from the whole sample analysis using the leverage values of a hat matrix of the random effects to achieve the predicted residual errors. Properties of the HAT method were investigated using seven agronomic and 1000 metabolomic traits of an inbred rice population. Results showed that the HAT method is a very good approximation of the CV method. The method was also applied to 10 traits in 1495 hybrid rice with 1.6 million SNPs, and to human height of 6161 subjects with roughly 0.5 million SNPs of the Framingham heart study data. Predictabilities of the HAT and CV methods were all similar. The HAT method allows us to easily evaluate the predictabilities of genomic prediction for large numbers of traits in very large populations. |
format | Online Article Text |
id | pubmed-5345720 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-5345720 2017-03-21 Predicted Residual Error Sum of Squares of Mixed Models: An Application for Genomic Prediction Xu, Shizhong G3 (Bethesda) Genomic Selection Genetics Society of America 2017-01-19 /pmc/articles/PMC5345720/ /pubmed/28108552 http://dx.doi.org/10.1534/g3.116.038059 Text en Copyright © 2017 Xu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Genomic Selection Xu, Shizhong Predicted Residual Error Sum of Squares of Mixed Models: An Application for Genomic Prediction |
title | Predicted Residual Error Sum of Squares of Mixed Models: An Application for Genomic Prediction |
title_sort | predicted residual error sum of squares of mixed models: an application for genomic prediction |
topic | Genomic Selection |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345720/ https://www.ncbi.nlm.nih.gov/pubmed/28108552 http://dx.doi.org/10.1534/g3.116.038059 |
work_keys_str_mv | AT xushizhong predictedresidualerrorsumofsquaresofmixedmodelsanapplicationforgenomicprediction |
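
The description in this record says that the HAT method corrects whole-sample residuals with the leverage values of a hat matrix to obtain predicted residuals, in place of refitting the model for every cross-validation fold. The sketch below illustrates that idea on a simplified analogue, not on the article's mixed model: ridge regression with no fixed effects and a shrinkage parameter `lam` that is held fixed rather than re-estimated. Under those assumptions the leave-one-out predicted residual equals the ordinary residual divided by one minus the leverage. The function names `press_hat` and `press_loo`, the simulated data, and the value of `lam` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def press_hat(X, y, lam):
    """PRESS from a single whole-sample ridge fit, using hat-matrix leverages."""
    p = X.shape[1]
    # Hat matrix of the ridge smoother: H = X (X'X + lam I)^-1 X'
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y                         # estimated residuals
    pred_resid = resid / (1.0 - np.diag(H))   # leverage-corrected residuals
    return float(np.sum(pred_resid ** 2))


def press_loo(X, y, lam):
    """PRESS by explicit leave-one-out refitting (n separate ridge fits)."""
    n, p = X.shape
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        beta = np.linalg.solve(Xi.T @ Xi + lam * np.eye(p), Xi.T @ yi)
        total += (y[i] - X[i] @ beta) ** 2
    return float(total)


# Simulated data (assumption): more markers than individuals.
rng = np.random.default_rng(0)
n, p = 60, 200
X = rng.standard_normal((n, p))
y = X @ rng.normal(0.0, 0.1, p) + rng.standard_normal(n)
lam = 50.0  # fixed shrinkage parameter (assumption)

press_from_hat = press_hat(X, y, lam)
press_from_loo = press_loo(X, y, lam)
print(press_from_hat, press_from_loo)
```

In this simplified setting the two quantities agree up to floating-point error, while the first needs one fit and the second needs `n`. The record's description calls the HAT method a very good approximation of K-fold CV in the mixed-model setting, so exact agreement should not be assumed there.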