Improving polygenic risk prediction from summary statistics by an empirical Bayes approach

Polygenic risk scores (PRS) from genome-wide association studies (GWAS) are increasingly used to predict disease risks. However some included variants could be false positives and the raw estimates of effect sizes from them may be subject to selection bias. In addition, the standard PRS approach req...

Bibliographic Details
Main authors: So, Hon-Cheong; Sham, Pak C.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5286518/
https://www.ncbi.nlm.nih.gov/pubmed/28145530
http://dx.doi.org/10.1038/srep41262
_version_ 1782504017382866944
author So, Hon-Cheong
Sham, Pak C.
author_facet So, Hon-Cheong
Sham, Pak C.
author_sort So, Hon-Cheong
collection PubMed
description Polygenic risk scores (PRS) from genome-wide association studies (GWAS) are increasingly used to predict disease risks. However, some included variants could be false positives and the raw estimates of effect sizes from them may be subject to selection bias. In addition, the standard PRS approach requires testing over a range of p-value thresholds, which are often chosen arbitrarily. The prediction error estimated from the optimized threshold may also be subject to an optimistic bias. To improve genomic risk prediction, we proposed new empirical Bayes approaches to recover the underlying effect sizes and used them as weights to construct PRS. We applied the new PRS to twelve cardio-metabolic traits in the Northern Finland Birth Cohort and demonstrated improvements in predictive power (in R²) when compared to standard PRS at the best p-value threshold. Importantly, for eleven out of the twelve traits studied, the predictive performance from the entire set of genome-wide markers outperformed the best R² from standard PRS at optimal p-value thresholds. Our proposed methodology essentially enables an automatic PRS weighting scheme without the need to choose tuning parameters. The new method also performed satisfactorily in simulations. It is computationally simple and does not require assumptions on the effect size distributions.
format Online
Article
Text
id pubmed-5286518
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-52865182017-02-06 Improving polygenic risk prediction from summary statistics by an empirical Bayes approach So, Hon-Cheong Sham, Pak C. Sci Rep Article Polygenic risk scores (PRS) from genome-wide association studies (GWAS) are increasingly used to predict disease risks. However some included variants could be false positives and the raw estimates of effect sizes from them may be subject to selection bias. In addition, the standard PRS approach requires testing over a range of p-value thresholds, which are often chosen arbitrarily. The prediction error estimated from the optimized threshold may also be subject to an optimistic bias. To improve genomic risk prediction, we proposed new empirical Bayes approaches to recover the underlying effect sizes and used them as weights to construct PRS. We applied the new PRS to twelve cardio-metabolic traits in the Northern Finland Birth Cohort and demonstrated improvements in predictive power (in R(2)) when compared to standard PRS at the best p-value threshold. Importantly, for eleven out of the twelve traits studied, the predictive performance from the entire set of genome-wide markers outperformed the best R(2) from standard PRS at optimal p-value thresholds. Our proposed methodology essentially enables an automatic PRS weighting scheme without the need of choosing tuning parameters. The new method also performed satisfactorily in simulations. It is computationally simple and does not require assumptions on the effect size distributions. Nature Publishing Group 2017-02-01 /pmc/articles/PMC5286518/ /pubmed/28145530 http://dx.doi.org/10.1038/srep41262 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle Article
So, Hon-Cheong
Sham, Pak C.
Improving polygenic risk prediction from summary statistics by an empirical Bayes approach
title Improving polygenic risk prediction from summary statistics by an empirical Bayes approach
title_full Improving polygenic risk prediction from summary statistics by an empirical Bayes approach
title_fullStr Improving polygenic risk prediction from summary statistics by an empirical Bayes approach
title_full_unstemmed Improving polygenic risk prediction from summary statistics by an empirical Bayes approach
title_short Improving polygenic risk prediction from summary statistics by an empirical Bayes approach
title_sort improving polygenic risk prediction from summary statistics by an empirical bayes approach
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5286518/
https://www.ncbi.nlm.nih.gov/pubmed/28145530
http://dx.doi.org/10.1038/srep41262
work_keys_str_mv AT sohoncheong improvingpolygenicriskpredictionfromsummarystatisticsbyanempiricalbayesapproach
AT shampakc improvingpolygenicriskpredictionfromsummarystatisticsbyanempiricalbayesapproach