
Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits

The polygenic risk score (PRS) can be used to stratify individuals at high risk of disease and to predict complex traits of individuals in a population. Previous studies developed PRS-based prediction models using linear regression and evaluated their predictive performance using the R (...


Bibliographic Details
Main authors: Jung, Hyein; Jung, Hae-Un; Baek, Eun Ju; Chung, Ju Yeon; Kwon, Shin Young; Kang, Ji-One; Lim, Ji Eun; Oh, Bermseok
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10203621/
https://www.ncbi.nlm.nih.gov/pubmed/37229196
http://dx.doi.org/10.3389/fgene.2023.1150889
_version_ 1785045676161236992
author Jung, Hyein
Jung, Hae-Un
Baek, Eun Ju
Chung, Ju Yeon
Kwon, Shin Young
Kang, Ji-One
Lim, Ji Eun
Oh, Bermseok
author_facet Jung, Hyein
Jung, Hae-Un
Baek, Eun Ju
Chung, Ju Yeon
Kwon, Shin Young
Kang, Ji-One
Lim, Ji Eun
Oh, Bermseok
author_sort Jung, Hyein
collection PubMed
description The polygenic risk score (PRS) can be used to stratify individuals at high risk of disease and to predict complex traits of individuals in a population. Previous studies developed PRS-based prediction models using linear regression and evaluated their predictive performance using the R² value. One of the key assumptions of linear regression is homoscedasticity: the variance of the residuals should be constant at each level of the predictor variables. However, some studies show that PRS models exhibit heteroscedasticity between PRS and traits. This study analyzes whether heteroscedasticity exists in PRS models of diverse disease-related traits and, if so, whether it affects the accuracy of PRS-based prediction in 354,761 Europeans from the UK Biobank. We constructed PRSs for 15 quantitative traits using LDpred2 and tested for heteroscedasticity between the PRSs and the 15 traits using three tests: the Breusch-Pagan (BP) test, the score test, and the F test. Thirteen of the fifteen traits showed significant heteroscedasticity. Replication using new PRSs from the PGS Catalog and independent samples (N = 23,620) from the UK Biobank confirmed the heteroscedasticity in ten traits. As a result, ten of the fifteen quantitative traits showed statistically significant heteroscedasticity between the PRS and the trait. The variance of the residuals increased as the PRS increased, and the prediction accuracy at each level of PRS tended to decrease as the variance of the residuals increased. In conclusion, heteroscedasticity was frequently observed in PRS-based prediction models of quantitative traits, and the accuracy of the predictive model may differ according to PRS values. Therefore, prediction models using the PRS should be constructed with heteroscedasticity taken into account.
format Online
Article
Text
id pubmed-10203621
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10203621 2023-05-24 Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits Jung, Hyein; Jung, Hae-Un; Baek, Eun Ju; Chung, Ju Yeon; Kwon, Shin Young; Kang, Ji-One; Lim, Ji Eun; Oh, Bermseok Front Genet Genetics Frontiers Media S.A. 2023-05-09 /pmc/articles/PMC10203621/ /pubmed/37229196 http://dx.doi.org/10.3389/fgene.2023.1150889 Text en Copyright © 2023 Jung, Jung, Baek, Chung, Kwon, Kang, Lim and Oh. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Jung, Hyein
Jung, Hae-Un
Baek, Eun Ju
Chung, Ju Yeon
Kwon, Shin Young
Kang, Ji-One
Lim, Ji Eun
Oh, Bermseok
Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
title Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
title_full Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
title_fullStr Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
title_full_unstemmed Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
title_short Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
title_sort investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10203621/
https://www.ncbi.nlm.nih.gov/pubmed/37229196
http://dx.doi.org/10.3389/fgene.2023.1150889
work_keys_str_mv AT junghyein investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits
AT junghaeun investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits
AT baekeunju investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits
AT chungjuyeon investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits
AT kwonshinyoung investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits
AT kangjione investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits
AT limjieun investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits
AT ohbermseok investigationofheteroscedasticityinpolygenicriskscoresacross15quantitativetraits
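
The description above names the Breusch-Pagan (BP) test as one way to check whether the residual variance of a PRS-based linear regression depends on the PRS. The sketch below illustrates that idea on simulated data only; it is not the authors' pipeline. The simulated `prs` and trait values, the effect size of 0.5, the variance function, and the helper names `ols_fit` and `breusch_pagan` are all assumptions made for illustration. It uses the studentized (Koenker) form of the test, in which the statistic is n times the R² of a regression of squared residuals on the predictor.

```python
import math
import random


def ols_fit(x, y):
    """Fit y = a + b*x by least squares; return (a, b, residuals, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, residuals, 1.0 - ss_res / ss_tot


def breusch_pagan(x, y):
    """Studentized Breusch-Pagan test with a single predictor.

    Returns (LM statistic, p-value). Under homoscedasticity the statistic
    is approximately chi-square distributed with 1 degree of freedom.
    """
    _, _, residuals, _ = ols_fit(x, y)
    squared = [r ** 2 for r in residuals]
    # Auxiliary regression: do squared residuals vary with the predictor?
    _, _, _, aux_r2 = ols_fit(x, squared)
    lm = len(x) * aux_r2
    # Chi-square survival function for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Hypothetical simulated data, not UK Biobank values.
rng = random.Random(0)
n = 5000
prs = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Trait whose residual spread grows with the PRS (heteroscedastic).
trait_het = [0.5 * s + rng.gauss(0.0, math.exp(0.4 * s)) for s in prs]
# Trait with constant residual spread (homoscedastic).
trait_homo = [0.5 * s + rng.gauss(0.0, 1.0) for s in prs]

lm_het, p_het = breusch_pagan(prs, trait_het)
lm_homo, p_homo = breusch_pagan(prs, trait_homo)

print("heteroscedastic trait: LM =", lm_het, "p =", p_het)
print("homoscedastic trait:   LM =", lm_homo, "p =", p_homo)
```

A small p-value for a trait suggests that the residual variance changes with the PRS, which is the situation the description reports for most of the traits studied. With covariates or several predictors, a library routine such as the Breusch-Pagan implementation in statsmodels would likely be a better fit than this single-predictor sketch.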