Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
The polygenic risk score (PRS) can be used to stratify individuals at high risk of disease and to predict complex traits of individuals in a population. Previous studies developed PRS-based prediction models using linear regression and evaluated the predictive performance of the model using the R (...
Main authors: | Jung, Hyein; Jung, Hae-Un; Baek, Eun Ju; Chung, Ju Yeon; Kwon, Shin Young; Kang, Ji-One; Lim, Ji Eun; Oh, Bermseok |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10203621/ https://www.ncbi.nlm.nih.gov/pubmed/37229196 http://dx.doi.org/10.3389/fgene.2023.1150889 |
_version_ | 1785045676161236992 |
---|---|
author | Jung, Hyein Jung, Hae-Un Baek, Eun Ju Chung, Ju Yeon Kwon, Shin Young Kang, Ji-One Lim, Ji Eun Oh, Bermseok |
author_facet | Jung, Hyein Jung, Hae-Un Baek, Eun Ju Chung, Ju Yeon Kwon, Shin Young Kang, Ji-One Lim, Ji Eun Oh, Bermseok |
author_sort | Jung, Hyein |
collection | PubMed |
description | The polygenic risk score (PRS) can be used to stratify individuals at high risk of disease and to predict complex traits of individuals in a population. Previous studies developed PRS-based prediction models using linear regression and evaluated their predictive performance using the R² value. One of the key assumptions of linear regression is that the variance of the residuals is constant at each level of the predictor variables, a property called homoscedasticity. However, some studies show that PRS models exhibit heteroscedasticity between the PRS and traits. This study analyzes whether heteroscedasticity exists in PRS models of diverse disease-related traits and, if so, whether it affects the accuracy of PRS-based prediction in 354,761 Europeans from the UK Biobank. We constructed PRSs for 15 quantitative traits using LDpred2 and tested for heteroscedasticity between the PRSs and the 15 traits using three different tests: the Breusch-Pagan (BP) test, the score test, and the F test. Thirteen of the fifteen traits showed significant heteroscedasticity. Further replication using new PRSs from the PGS Catalog and independent samples (N = 23,620) from the UK Biobank confirmed the heteroscedasticity in ten traits. As a result, ten of the fifteen quantitative traits showed statistically significant heteroscedasticity between the PRS and the trait. The variance of the residuals increased as the PRS increased, and the prediction accuracy at each level of PRS tended to decrease as the variance of the residuals increased. In conclusion, heteroscedasticity was frequently observed in the PRS-based prediction models of quantitative traits, and the accuracy of the predictive model may differ according to PRS values. Therefore, prediction models using the PRS should be constructed with heteroscedasticity taken into account. |
format | Online Article Text |
id | pubmed-10203621 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10203621 2023-05-24 Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits Jung, Hyein Jung, Hae-Un Baek, Eun Ju Chung, Ju Yeon Kwon, Shin Young Kang, Ji-One Lim, Ji Eun Oh, Bermseok Front Genet Genetics The polygenic risk score (PRS) can be used to stratify individuals at high risk of disease and to predict complex traits of individuals in a population. Previous studies developed PRS-based prediction models using linear regression and evaluated their predictive performance using the R² value. One of the key assumptions of linear regression is that the variance of the residuals is constant at each level of the predictor variables, a property called homoscedasticity. However, some studies show that PRS models exhibit heteroscedasticity between the PRS and traits. This study analyzes whether heteroscedasticity exists in PRS models of diverse disease-related traits and, if so, whether it affects the accuracy of PRS-based prediction in 354,761 Europeans from the UK Biobank. We constructed PRSs for 15 quantitative traits using LDpred2 and tested for heteroscedasticity between the PRSs and the 15 traits using three different tests: the Breusch-Pagan (BP) test, the score test, and the F test. Thirteen of the fifteen traits showed significant heteroscedasticity. Further replication using new PRSs from the PGS Catalog and independent samples (N = 23,620) from the UK Biobank confirmed the heteroscedasticity in ten traits. As a result, ten of the fifteen quantitative traits showed statistically significant heteroscedasticity between the PRS and the trait. The variance of the residuals increased as the PRS increased, and the prediction accuracy at each level of PRS tended to decrease as the variance of the residuals increased. In conclusion, heteroscedasticity was frequently observed in the PRS-based prediction models of quantitative traits, and the accuracy of the predictive model may differ according to PRS values. Therefore, prediction models using the PRS should be constructed with heteroscedasticity taken into account. Frontiers Media S.A. 2023-05-09 /pmc/articles/PMC10203621/ /pubmed/37229196 http://dx.doi.org/10.3389/fgene.2023.1150889 Text en Copyright © 2023 Jung, Jung, Baek, Chung, Kwon, Kang, Lim and Oh. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Jung, Hyein Jung, Hae-Un Baek, Eun Ju Chung, Ju Yeon Kwon, Shin Young Kang, Ji-One Lim, Ji Eun Oh, Bermseok Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits |
title | Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits |
title_full | Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits |
title_fullStr | Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits |
title_full_unstemmed | Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits |
title_short | Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits |
title_sort | investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10203621/ https://www.ncbi.nlm.nih.gov/pubmed/37229196 http://dx.doi.org/10.3389/fgene.2023.1150889 |
work_keys_str_mv | AT junghyein investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits AT junghaeun investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits AT baekeunju investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits AT chungjuyeon investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits AT kwonshinyoung investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits AT kangjione investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits AT limjieun investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits AT ohbermseok investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits |
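
The heteroscedasticity check described in the abstract (does the residual variance of a linear PRS model change with the PRS?) can be illustrated with a small sketch on simulated data. This is not the authors' pipeline: it assumes a single predictor, uses the studentized (Koenker) form of the Breusch-Pagan statistic, and relies only on the Python standard library. The function names (`simple_ols`, `breusch_pagan`, `residual_variance_by_bin`, `simulate`) and all simulation parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, residuals, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, residuals, 1.0 - ss_res / ss_tot


def breusch_pagan(x, y):
    """Studentized (Koenker) Breusch-Pagan test with one predictor.

    The statistic is n * R^2 from regressing the squared residuals on x,
    compared against a chi-square distribution with 1 degree of freedom.
    """
    _, _, residuals, _ = simple_ols(x, y)
    squared = [r ** 2 for r in residuals]
    _, _, _, aux_r2 = simple_ols(x, squared)
    lm = max(len(x) * aux_r2, 0.0)  # guard against tiny negative rounding error
    # Survival function of chi-square with 1 df: P(X > lm) = erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


def residual_variance_by_bin(x, y, bins=5):
    """Sample variance of the OLS residuals within equal-sized bins of x."""
    _, _, residuals, _ = simple_ols(x, y)
    ordered = [r for _, r in sorted(zip(x, residuals))]
    size = len(ordered) // bins
    variances = []
    for b in range(bins):
        end = (b + 1) * size if b < bins - 1 else len(ordered)
        chunk = ordered[b * size:end]
        mean = sum(chunk) / len(chunk)
        variances.append(sum((r - mean) ** 2 for r in chunk) / (len(chunk) - 1))
    return variances


def simulate(n, hetero, seed=0):
    """Simulated (hypothetical) PRS and trait; residual SD is exp(hetero * PRS)."""
    rng = random.Random(seed)
    prs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    trait = [0.5 * p + rng.gauss(0.0, math.exp(hetero * p)) for p in prs]
    return prs, trait


if __name__ == "__main__":
    prs, trait = simulate(5000, hetero=0.3, seed=1)
    lm, p = breusch_pagan(prs, trait)
    print("BP statistic:", lm, "p-value:", p)
    print("residual variance by PRS quintile:", residual_variance_by_bin(prs, trait))
```

With `hetero` set above zero, the simulated residual spread widens as the PRS rises, which is the pattern the abstract reports for most traits; setting `hetero=0.0` gives a homoscedastic comparison. For real analyses, an established implementation (for example in a statistics package) would normally be preferable to this hand-rolled version.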