Calculating Polygenic Risk Scores (PRS) in UK Biobank: A Practical Guide for Epidemiologists

A polygenic risk score estimates the genetic risk of an individual for some disease or trait, calculated by aggregating the effect of many common variants associated with the condition. With the increasing availability of genetic data in large cohort studies such as the UK Biobank, inclusion of this genetic risk as a covariate in statistical analyses is becoming more widespread.
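As a rough illustration of the "aggregating the effect of many common variants" step, the sketch below computes a score as a weighted sum of effect-allele dosages. It is a minimal example under stated assumptions, not the method of the article: the function name `polygenic_risk_score`, the variant identifiers, the weights, and the dosages are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of effect-allele dosages for one individual.

    dosages: effect-allele dosage per variant (0 to 2; may be fractional
             for imputed data).
    weights: per-variant effect size taken from a published score.
    """
    # Refuse to compute a partial score silently if a variant is absent.
    missing = [variant for variant in weights if variant not in dosages]
    if missing:
        raise KeyError(f"variants missing from genotype data: {missing}")
    return sum(weights[variant] * dosages[variant] for variant in weights)


# Hypothetical weights and dosages, for illustration only.
weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.125}
individual = {"rs_a": 2.0, "rs_b": 1.0, "rs_c": 0.0}

score = polygenic_risk_score(individual, weights)
print(score)  # 0.5*2.0 + (-0.25)*1.0 + 0.125*0.0 = 0.75
```

Raising an error on missing variants is a design choice made here for clarity; in practice one might instead impute or drop such variants, depending on the score and the data.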

Bibliographic Details
Main authors: Collister, Jennifer A.; Liu, Xiaonan; Clifton, Lei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8894758/
https://www.ncbi.nlm.nih.gov/pubmed/35251129
http://dx.doi.org/10.3389/fgene.2022.818574
_version_ 1784662757192237056
author Collister, Jennifer A.
Liu, Xiaonan
Clifton, Lei
collection PubMed
description A polygenic risk score estimates the genetic risk of an individual for some disease or trait, calculated by aggregating the effect of many common variants associated with the condition. With the increasing availability of genetic data in large cohort studies such as the UK Biobank, inclusion of this genetic risk as a covariate in statistical analyses is becoming more widespread. Previously this required specialist knowledge, but as tooling and data availability have improved it has become more feasible for statisticians and epidemiologists to calculate existing scores themselves for use in analyses. While tutorial resources exist for conducting genome-wide association studies and generating new polygenic risk scores, fewer guides exist for the simple calculation and application of existing genetic scores. This guide outlines the key steps of this process: selection of suitable polygenic risk scores from the literature, extraction of relevant genetic variants and verification of their quality, calculation of the risk score, and key considerations for its inclusion in statistical models, using the UK Biobank imputed data as a model data set. Many of the techniques in this guide will generalize to other datasets; however, we also focus on some of the specific techniques required for using data in the formats UK Biobank have selected. This includes some of the challenges faced when working with large numbers of variants, where the computation time required by some tools is impractical. While we have focused on only a couple of tools, which may not be the best ones for every given aspect of the process, one barrier to working with genetic data is the sheer volume of tools available, and the difficulty for a novice to assess their viability. By discussing in depth a couple of tools that are adequate for the calculation even at large scale, we hope to make polygenic risk scores more accessible to a wider range of researchers.
format Online
Article
Text
id pubmed-8894758
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8894758 2022-03-05 Calculating Polygenic Risk Scores (PRS) in UK Biobank: A Practical Guide for Epidemiologists Collister, Jennifer A. Liu, Xiaonan Clifton, Lei Front Genet Genetics Frontiers Media S.A. 2022-02-18 /pmc/articles/PMC8894758/ /pubmed/35251129 http://dx.doi.org/10.3389/fgene.2022.818574 Text en
Copyright © 2022 Collister, Liu and Clifton. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Calculating Polygenic Risk Scores (PRS) in UK Biobank: A Practical Guide for Epidemiologists
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8894758/
https://www.ncbi.nlm.nih.gov/pubmed/35251129
http://dx.doi.org/10.3389/fgene.2022.818574