
An Upper Bound for Accuracy of Prediction Using GBLUP

This study aims at characterizing the asymptotic behavior of genomic prediction R² as the size of the reference population increases for common or rare QTL alleles through simulations. Haplotypes derived from whole-genome sequence of 85 Caucasian individuals from the 1000 Genomes Project were use...


Bibliographic Details
Main authors: Karaman, Emre, Cheng, Hao, Firat, Mehmet Z., Garrick, Dorian J., Fernando, Rohan L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4986954/
https://www.ncbi.nlm.nih.gov/pubmed/27529480
http://dx.doi.org/10.1371/journal.pone.0161054
_version_ 1782448245954314240
author Karaman, Emre
Cheng, Hao
Firat, Mehmet Z.
Garrick, Dorian J.
Fernando, Rohan L.
collection PubMed
description This study characterizes, through simulation, the asymptotic behavior of genomic prediction R² as the size of the reference population increases, for common or rare QTL alleles. Haplotypes derived from whole-genome sequences of 85 Caucasian individuals from the 1000 Genomes Project were used to simulate random mating in a population of 10,000 individuals for at least 100 generations, creating human-like LD structure in a large number of individuals. To reduce computational demands, only SNPs within a 0.1 Morgan (M) region of each of the first 5 chromosomes were used, so the total simulated genome length was 0.5 M. For a genome length of 30 M, achieving the same genomic prediction R² as with a 0.5 M genome would require a reference population 60-fold larger. Three scenarios were considered, differing in the minor allele frequency distributions of markers and QTL, with h² = 0.8, resembling height in humans. Each scenario had 4,200 markers and 70 QTL. Prediction accuracy was treated as an estimability problem, which provides an upper bound for the reliability of prediction and thus for prediction R². The genomic prediction methods GBLUP, BayesB and BayesC were compared. The results imply that, for human height, the variable selection methods BayesB and BayesC applied to a 30 M genome have no advantage over GBLUP when the reference population is small (<6,000 individuals), but are superior as more individuals are included. All methods become asymptotically equivalent in terms of prediction R², which approaches genomic heritability when the reference population reaches 480,000 individuals.
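The genome-length scaling stated in the description can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the article: the function and constant names are hypothetical, and it assumes the required reference population size scales in direct proportion to genome length, which is how the abstract's 60-fold figure reads. The back-conversion of the quoted 30 M population sizes to the 0.5 M simulated scale is an inference from that ratio, not a number reported in this record.

```python
# Illustrative arithmetic only; names are hypothetical, not from the article.
SIMULATED_GENOME_MORGANS = 0.5  # 5 chromosome regions x 0.1 M each (from the abstract)
TARGET_GENOME_MORGANS = 30.0    # genome length quoted in the abstract


def scale_factor(simulated=SIMULATED_GENOME_MORGANS, target=TARGET_GENOME_MORGANS):
    """Ratio of target to simulated genome length."""
    return target / simulated


def equivalent_reference_size(n_simulated):
    """Reference size at the target genome length, assuming the size needed
    for a given prediction R^2 is proportional to genome length."""
    return round(n_simulated * scale_factor())


def simulated_reference_size(n_target):
    """Inverse conversion: target-scale reference size back to the simulated scale."""
    return round(n_target / scale_factor())


if __name__ == "__main__":
    print(scale_factor())                     # fold increase between 0.5 M and 30 M
    print(simulated_reference_size(6_000))    # simulated-scale size behind "<6,000"
    print(simulated_reference_size(480_000))  # simulated-scale size behind "480,000"
```

Under this proportionality assumption, the 6,000 and 480,000 individuals quoted for a 30 M genome correspond to 100 and 8,000 individuals at the simulated 0.5 M scale.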
format Online
Article
Text
id pubmed-4986954
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-49869542016-08-29 An Upper Bound for Accuracy of Prediction Using GBLUP Karaman, Emre Cheng, Hao Firat, Mehmet Z. Garrick, Dorian J. Fernando, Rohan L. PLoS One Research Article
Public Library of Science 2016-08-16 /pmc/articles/PMC4986954/ /pubmed/27529480 http://dx.doi.org/10.1371/journal.pone.0161054 Text en © 2016 Karaman et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title An Upper Bound for Accuracy of Prediction Using GBLUP
topic Research Article