
Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data

Predicting genetic risks for common diseases may improve their prevention and early treatment. In recent years, various additive-model-based polygenic risk score (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) using data collected from gen...


Bibliographic Details
Main Authors: Jiang, Wei; Chen, Ling; Girgenti, Matthew J.; Zhao, Hongyu
Format: Online Article Text
Language: English
Published: American Journal Experts 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10312948/
https://www.ncbi.nlm.nih.gov/pubmed/37398263
http://dx.doi.org/10.21203/rs.3.rs-2939390/v1
author Jiang, Wei
Chen, Ling
Girgenti, Matthew J.
Zhao, Hongyu
collection PubMed
description Predicting genetic risks for common diseases may improve their prevention and early treatment. In recent years, various additive-model-based polygenic risk score (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) using data collected from genome-wide association studies (GWAS). Some of these methods require access to another external individual-level GWAS dataset to tune the hyperparameters, which can be difficult because of privacy and security-related concerns. Additionally, leaving out partial data for hyperparameter tuning can reduce the predictive accuracy of the constructed PRS model. In this article, we propose a novel method, called PRStuning, to automatically tune hyperparameters for different PRS methods using only GWAS summary statistics from the training data. The core idea is to first predict the performance of the PRS method with different parameter values, and then select the parameters with the best predicted performance. Because directly using the effects observed from the training data tends to overestimate the performance in the testing data (a phenomenon known as overfitting), we adopt an empirical Bayes approach to shrink the predicted performance in accordance with the estimated genetic architecture of the disease. Results from extensive simulations and real data applications demonstrate that PRStuning can accurately predict the PRS performance across PRS methods and parameters, and it can help select the best-performing parameters.
format Online
Article
Text
id pubmed-10312948
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Journal Experts
record_format MEDLINE/PubMed
spelling pubmed-10312948 2023-07-01 Res Sq Article
American Journal Experts 2023-05-31 Text en
This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data
topic Article