Genomic prediction using subsampling

Bibliographic Details
Main Authors: Xavier, Alencar, Xu, Shizhong, Muir, William, Rainey, Katy Martin
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5366167/
https://www.ncbi.nlm.nih.gov/pubmed/28340551
http://dx.doi.org/10.1186/s12859-017-1582-3
author Xavier, Alencar
Xu, Shizhong
Muir, William
Rainey, Katy Martin
collection PubMed
description BACKGROUND: Genome-wide assisted selection is a critical tool for the genetic improvement of plants and animals. Whole-genome regression models in Bayesian framework represent the main family of prediction methods. Fitting such models with a large number of observations involves a prohibitive computational burden. We propose the use of subsampling bootstrap Markov chain in genomic prediction. Such method consists of fitting whole-genome regression models by subsampling observations in each round of a Markov Chain Monte Carlo. We evaluated the effect of subsampling bootstrap on prediction and computational parameters. RESULTS: Across datasets, we observed an optimal subsampling proportion of observations around 50% with replacement, and around 33% without replacement. Subsampling provided a substantial decrease in computation time, reducing the time to fit the model by half. On average, losses on predictive properties imposed by subsampling were negligible, usually below 1%. For each dataset, an optimal subsampling point that improves prediction properties was observed, but the improvements were also negligible. CONCLUSION: Combining subsampling with Gibbs sampling is an interesting ensemble algorithm. The investigation indicates that the subsampling bootstrap Markov chain algorithm substantially reduces computational burden associated with model fitting, and it may slightly enhance prediction properties. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1582-3) contains supplementary material, which is available to authorized users.
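The description above says the method fits a whole-genome regression by subsampling observations in each round of a Markov chain Monte Carlo. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a ridge-type model with a single marker-effect variance, coordinate-wise Gibbs updates, and scaled inverse chi-square priors whose hyperparameters (`df0`, `S_e`, `S_b`) are chosen here purely for illustration. The function name `subsampling_gibbs_ridge` and all its parameters are hypothetical.

```python
import numpy as np


def subsampling_gibbs_ridge(X, y, n_iter=300, burn_in=50,
                            proportion=0.5, replace=True, seed=0):
    """Illustrative sketch: Gibbs sampler for a ridge-type whole-genome
    regression that redraws a random subset of observations every round.
    X is a float array (n observations x p markers), y has length n."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = max(2, int(round(proportion * n)))  # observations used per round
    if not replace:
        m = min(m, n)  # cannot draw more than n rows without replacement

    # Assumed weakly informative priors (scaled inverse chi-square).
    df0 = 5.0
    S_e = 0.5 * float(np.var(y))
    S_b = S_e / max(float(np.var(X, axis=0).sum()), 1e-12)

    mu, beta = float(np.mean(y)), np.zeros(p)
    var_e, var_b = S_e, S_b
    mu_sum, beta_sum = 0.0, np.zeros(p)

    for it in range(n_iter):
        # Subsampling step: a fresh subset of observations for this round.
        idx = rng.choice(n, size=m, replace=replace)
        Xs, ys = X[idx], y[idx]
        e = ys - mu - Xs @ beta  # residuals on the current subset
        lam = var_e / var_b

        # Update each marker effect from its full conditional.
        for j in range(p):
            xj = Xs[:, j]
            xx = xj @ xj
            mean_j = (xj @ e + xx * beta[j]) / (xx + lam)
            new_bj = rng.normal(mean_j, np.sqrt(var_e / (xx + lam)))
            e -= xj * (new_bj - beta[j])
            beta[j] = new_bj

        # Update the intercept (flat prior assumed).
        e += mu
        mu = rng.normal(e.mean(), np.sqrt(var_e / m))
        e -= mu

        # Update the variance components.
        var_b = (beta @ beta + df0 * S_b) / rng.chisquare(p + df0)
        var_e = (e @ e + df0 * S_e) / rng.chisquare(m + df0)

        # Accumulate posterior means after burn-in.
        if it >= burn_in:
            mu_sum += mu
            beta_sum += beta

    kept = n_iter - burn_in
    return mu_sum / kept, beta_sum / kept
```

With `proportion=0.5, replace=True` or `proportion=0.33, replace=False` the sketch mirrors the subsampling proportions the abstract reports as optimal. Each round touches only `m` of the `n` rows, which is where a saving in computation time would come from.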
format Online
Article
Text
id pubmed-5366167
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5366167 2017-03-28 Genomic prediction using subsampling Xavier, Alencar Xu, Shizhong Muir, William Rainey, Katy Martin BMC Bioinformatics Research Article BioMed Central 2017-03-24 /pmc/articles/PMC5366167/ /pubmed/28340551 http://dx.doi.org/10.1186/s12859-017-1582-3 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Genomic prediction using subsampling
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5366167/
https://www.ncbi.nlm.nih.gov/pubmed/28340551
http://dx.doi.org/10.1186/s12859-017-1582-3