
Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction

BACKGROUND: A random multiple-regression model that simultaneously fit all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction, using whole-genome data. Leave-one-out cross validation can be used to quantify t...


Bibliographic Details
Main authors: Cheng, Hao, Garrick, Dorian J., Fernando, Rohan L.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5414316/
https://www.ncbi.nlm.nih.gov/pubmed/28469846
http://dx.doi.org/10.1186/s40104-017-0164-6
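
The abstract contrasts naive leave-one-out cross validation, which refits the model n times, with strategies that need little more than a single analysis. The block below is a minimal sketch of that contrast and not the authors' implementation. It assumes a simplified ridge-regression stand-in for the random marker-effects model: no fixed effects, and a known shrinkage parameter `lam` (taken here as the ratio of residual variance to marker-effect variance). The function names and the simulated data are hypothetical. Under those assumptions, each leave-one-out prediction error equals the full-data residual divided by one minus the corresponding diagonal element of the hat matrix, so one fit is enough.

```python
import numpy as np


def loo_residuals_naive(X, y, lam):
    """Leave-one-out prediction errors by refitting the model n times."""
    n, p = X.shape
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # drop observation i from training
        Xt, yt = X[keep], y[keep]
        # Ridge solution for marker effects: (Xt'Xt + lam*I)^-1 Xt'yt
        beta = np.linalg.solve(Xt.T @ Xt + lam * np.eye(p), Xt.T @ yt)
        errors[i] = y[i] - X[i] @ beta    # error when predicting the held-out record
    return errors


def loo_residuals_fast(X, y, lam):
    """Same errors from a single fit, using e_i / (1 - H_ii)."""
    n, p = X.shape
    # A = (X'X + lam*I)^-1 X', so the hat matrix is H = X A
    A = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    h_diag = np.einsum("ij,ji->i", X, A)  # diagonal of H without forming H
    residuals = y - X @ (A @ y)           # full-data residuals
    return residuals / (1.0 - h_diag)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 500))   # 200 observations, 500 markers (simulated)
    y = X @ rng.standard_normal(500) * 0.1 + rng.standard_normal(200)
    naive = loo_residuals_naive(X, y, lam=50.0)
    fast = loo_residuals_fast(X, y, lam=50.0)
    print("max abs difference:", np.max(np.abs(naive - fast)))
```

The two functions should agree up to floating-point error. The naive version solves n systems of size p, while the fast version solves one, which is the source of the kind of speed-up the abstract reports.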
author Cheng, Hao
Garrick, Dorian J.
Fernando, Rohan L.
collection PubMed
description BACKGROUND: A random multiple-regression model that simultaneously fit all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction, using whole-genome data. Leave-one-out cross validation can be used to quantify the predictive ability of a statistical model. METHODS: Naive application of leave-one-out cross validation is computationally intensive because the training and validation analyses need to be repeated n times, once for each observation. Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis. RESULTS: The efficient leave-one-out cross validation strategies are 786 times faster than the naive application for a simulated dataset with 1,000 observations and 10,000 markers, and 99 times faster with 1,000 observations and 100 markers. These efficiencies relative to the naive approach using the same model will increase as the number of observations increases. CONCLUSIONS: Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis.
format Online
Article
Text
id pubmed-5414316
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5414316 2017-05-03 Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction Cheng, Hao Garrick, Dorian J. Fernando, Rohan L. J Anim Sci Biotechnol Methodology Article BioMed Central 2017-05-02 /pmc/articles/PMC5414316/ /pubmed/28469846 http://dx.doi.org/10.1186/s40104-017-0164-6 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction
topic Methodology Article