Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data
Predicting genetic risks for common diseases may improve their prevention and early treatment. In recent years, various additive-model-based polygenic risk scores (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) using data collected from gen...
Main authors: | Jiang, Wei; Chen, Ling; Girgenti, Matthew J.; Zhao, Hongyu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Journal Experts 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10312948/ https://www.ncbi.nlm.nih.gov/pubmed/37398263 http://dx.doi.org/10.21203/rs.3.rs-2939390/v1 |
_version_ | 1785067016657305600 |
---|---|
author | Jiang, Wei Chen, Ling Girgenti, Matthew J. Zhao, Hongyu |
author_facet | Jiang, Wei Chen, Ling Girgenti, Matthew J. Zhao, Hongyu |
author_sort | Jiang, Wei |
collection | PubMed |
description | Predicting genetic risks for common diseases may improve their prevention and early treatment. In recent years, various additive-model-based polygenic risk scores (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) using data collected from genome-wide association studies (GWAS). Some of these methods require access to another external individual-level GWAS dataset to tune the hyperparameters, which can be difficult because of privacy and security-related concerns. Additionally, leaving out partial data for hyperparameter tuning can reduce the predictive accuracy of the constructed PRS model. In this article, we propose a novel method, called PRStuning, to automatically tune hyperparameters for different PRS methods using only GWAS summary statistics from the training data. The core idea is to first predict the performance of the PRS method with different parameter values, and then select the parameters with the best prediction performance. Because directly using the effects observed from the training data tends to overestimate the performance in the testing data (a phenomenon known as overfitting), we adopt an empirical Bayes approach to shrinking the predicted performance in accordance with the estimated genetic architecture of the disease. Results from extensive simulations and real data applications demonstrate that PRStuning can accurately predict the PRS performance across PRS methods and parameters, and it can help select the best-performing parameters. |
format | Online Article Text |
id | pubmed-10312948 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Journal Experts |
record_format | MEDLINE/PubMed |
spelling | pubmed-103129482023-07-01 Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data Jiang, Wei Chen, Ling Girgenti, Matthew J. Zhao, Hongyu Res Sq Article Predicting genetic risks for common diseases may improve their prevention and early treatment. In recent years, various additive-model-based polygenic risk scores (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) using data collected from genome-wide association studies (GWAS). Some of these methods require access to another external individual-level GWAS dataset to tune the hyperparameters, which can be difficult because of privacy and security-related concerns. Additionally, leaving out partial data for hyperparameter tuning can reduce the predictive accuracy of the constructed PRS model. In this article, we propose a novel method, called PRStuning, to automatically tune hyperparameters for different PRS methods using only GWAS summary statistics from the training data. The core idea is to first predict the performance of the PRS method with different parameter values, and then select the parameters with the best prediction performance. Because directly using the effects observed from the training data tends to overestimate the performance in the testing data (a phenomenon known as overfitting), we adopt an empirical Bayes approach to shrinking the predicted performance in accordance with the estimated genetic architecture of the disease. Results from extensive simulations and real data applications demonstrate that PRStuning can accurately predict the PRS performance across PRS methods and parameters, and it can help select the best-performing parameters. 
American Journal Experts 2023-05-31 /pmc/articles/PMC10312948/ /pubmed/37398263 http://dx.doi.org/10.21203/rs.3.rs-2939390/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Jiang, Wei Chen, Ling Girgenti, Matthew J. Zhao, Hongyu Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data |
title | Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data |
title_full | Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data |
title_fullStr | Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data |
title_full_unstemmed | Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data |
title_short | Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data |
title_sort | tuning parameters for polygenic risk score methods using gwas summary statistics from training data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10312948/ https://www.ncbi.nlm.nih.gov/pubmed/37398263 http://dx.doi.org/10.21203/rs.3.rs-2939390/v1 |
work_keys_str_mv | AT jiangwei tuningparametersforpolygenicriskscoremethodsusinggwassummarystatisticsfromtrainingdata AT chenling tuningparametersforpolygenicriskscoremethodsusinggwassummarystatisticsfromtrainingdata AT girgentimatthewj tuningparametersforpolygenicriskscoremethodsusinggwassummarystatisticsfromtrainingdata AT zhaohongyu tuningparametersforpolygenicriskscoremethodsusinggwassummarystatisticsfromtrainingdata |
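
The description above outlines a two-step idea: predict the performance of a PRS method at each candidate parameter value, then keep the value with the best predicted performance, after shrinking training-data effects to counter overfitting. The following Python sketch illustrates only that general shape. It is not the PRStuning implementation: the function names `shrink_effect` and `select_best_parameter`, the normal prior, and the toy performance function are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple


def shrink_effect(beta_hat: float, se: float, prior_var: float) -> float:
    """Shrink an observed effect toward zero.

    Assumption (not from the record): a N(0, prior_var) prior on the true
    effect and a N(beta, se^2) sampling model, whose posterior mean is
    beta_hat * prior_var / (prior_var + se^2).
    """
    return beta_hat * prior_var / (prior_var + se ** 2)


def select_best_parameter(
    candidates: Iterable[float],
    predicted_performance: Callable[[float], float],
) -> Tuple[float, Dict[float, float]]:
    """Score every candidate parameter and return the best one with all scores."""
    scores = {p: predicted_performance(p) for p in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Toy stand-in for a predicted-performance curve (hypothetical): it peaks at 0.1.
def toy_performance(p: float) -> float:
    return -(p - 0.1) ** 2


best_param, all_scores = select_best_parameter([0.01, 0.1, 1.0], toy_performance)
shrunk = shrink_effect(beta_hat=0.2, se=0.1, prior_var=0.01)
```

In this sketch the shrinkage factor `prior_var / (prior_var + se ** 2)` is always between 0 and 1, so noisier estimates (larger `se`) are pulled more strongly toward zero. In the method described by the record, the amount of shrinkage is tied to the estimated genetic architecture of the disease rather than to a fixed prior variance.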