Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction
BACKGROUND: A random multiple-regression model that simultaneously fit all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction, using whole-genome data. Leave-one-out cross validation can be used to quantify t...
Main authors: | Cheng, Hao, Garrick, Dorian J., Fernando, Rohan L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5414316/ https://www.ncbi.nlm.nih.gov/pubmed/28469846 http://dx.doi.org/10.1186/s40104-017-0164-6 |
_version_ | 1783233349451513856 |
---|---|
author | Cheng, Hao Garrick, Dorian J. Fernando, Rohan L. |
author_facet | Cheng, Hao Garrick, Dorian J. Fernando, Rohan L. |
author_sort | Cheng, Hao |
collection | PubMed |
description | BACKGROUND: A random multiple-regression model that simultaneously fit all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction, using whole-genome data. Leave-one-out cross validation can be used to quantify the predictive ability of a statistical model. METHODS: Naive application of leave-one-out cross validation is computationally intensive because the training and validation analyses need to be repeated n times, once for each observation. Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis. RESULTS: The efficient leave-one-out cross validation strategies are 786 times faster than the naive application for a simulated dataset with 1,000 observations and 10,000 markers and 99 times faster with 1,000 observations and 100 markers. These efficiencies relative to the naive approach using the same model will increase with increases in the number of observations. CONCLUSIONS: Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis. |
format | Online Article Text |
id | pubmed-5414316 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-54143162017-05-03 Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction Cheng, Hao Garrick, Dorian J. Fernando, Rohan L. J Anim Sci Biotechnol Methodology Article BACKGROUND: A random multiple-regression model that simultaneously fit all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction, using whole-genome data. Leave-one-out cross validation can be used to quantify the predictive ability of a statistical model. METHODS: Naive application of leave-one-out cross validation is computationally intensive because the training and validation analyses need to be repeated n times, once for each observation. Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis. RESULTS: The efficient leave-one-out cross validation strategies are 786 times faster than the naive application for a simulated dataset with 1,000 observations and 10,000 markers and 99 times faster with 1,000 observations and 100 markers. These efficiencies relative to the naive approach using the same model will increase with increases in the number of observations. CONCLUSIONS: Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis. BioMed Central 2017-05-02 /pmc/articles/PMC5414316/ /pubmed/28469846 http://dx.doi.org/10.1186/s40104-017-0164-6 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Cheng, Hao Garrick, Dorian J. Fernando, Rohan L. Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction |
title | Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5414316/ https://www.ncbi.nlm.nih.gov/pubmed/28469846 http://dx.doi.org/10.1186/s40104-017-0164-6 |
work_keys_str_mv | AT chenghao efficientstrategiesforleaveoneoutcrossvalidationforgenomicbestlinearunbiasedprediction AT garrickdorianj efficientstrategiesforleaveoneoutcrossvalidationforgenomicbestlinearunbiasedprediction AT fernandorohanl efficientstrategiesforleaveoneoutcrossvalidationforgenomicbestlinearunbiasedprediction |
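
The abstract contrasts a naive leave-one-out cross validation, which refits the model n times, with strategies that need little more than a single analysis. This record does not reproduce the article's derivations, so the following is only a minimal sketch of one well-known identity of that kind, assuming a ridge-regression formulation of the marker model with a fixed shrinkage parameter `lam` (a hypothetical stand-in for the ratio of residual to marker-effect variance). Under that assumption the leave-one-out prediction error for observation i equals the full-data residual divided by one minus the i-th diagonal element of the hat matrix. The function names and the simulated data are illustrative, not taken from the article.

```python
import numpy as np

def naive_loocv(X, y, lam):
    """Refit ridge regression n times, each time leaving one observation out."""
    n, p = X.shape
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xt, yt = X[keep], y[keep]
        # Ridge solution on the n-1 training observations.
        beta = np.linalg.solve(Xt.T @ Xt + lam * np.eye(p), Xt.T @ yt)
        errors[i] = y[i] - X[i] @ beta
    return errors

def efficient_loocv(X, y, lam):
    """Single fit: leave-one-out error = residual_i / (1 - h_ii)."""
    n, p = X.shape
    # (X'X + lam*I)^{-1} X', a p x n matrix obtained with one solve.
    c_inv_xt = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    beta = c_inv_xt @ y
    residuals = y - X @ beta
    # Diagonal of the hat matrix H = X (X'X + lam*I)^{-1} X'.
    h = np.einsum("ij,ji->i", X, c_inv_xt)
    return residuals / (1.0 - h)

# Small simulated example (assumed sizes, genotypes coded 0/1/2).
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(50, 20)).astype(float)
y = X @ rng.normal(size=20) + rng.normal(size=50)

naive = naive_loocv(X, y, lam=5.0)
fast = efficient_loocv(X, y, lam=5.0)
print(np.allclose(naive, fast))
```

The naive function performs n solves while the efficient one performs a single solve, which is why the gap between the two grows with the number of observations. The speed-ups quoted in the abstract come from the authors' own implementation and datasets, not from this sketch.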