Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction

The accuracy of polygenic risk scores (PRSs) to predict complex diseases increases with the training sample size. PRSs are generally derived based on summary statistics from large meta-analyses of multiple genome-wide association studies (GWASs). However, it is now common for researchers to have acc...

Bibliographic Details
Main Authors: Albiñana, Clara, Grove, Jakob, McGrath, John J., Agerbo, Esben, Wray, Naomi R., Bulik, Cynthia M., Nordentoft, Merete, Hougaard, David M., Werge, Thomas, Børglum, Anders D., Mortensen, Preben Bo, Privé, Florian, Vilhjálmsson, Bjarni J.
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8206385/
https://www.ncbi.nlm.nih.gov/pubmed/33964208
http://dx.doi.org/10.1016/j.ajhg.2021.04.014
_version_ 1783708623856205824
author Albiñana, Clara
Grove, Jakob
McGrath, John J.
Agerbo, Esben
Wray, Naomi R.
Bulik, Cynthia M.
Nordentoft, Merete
Hougaard, David M.
Werge, Thomas
Børglum, Anders D.
Mortensen, Preben Bo
Privé, Florian
Vilhjálmsson, Bjarni J.
collection PubMed
description The accuracy of polygenic risk scores (PRSs) to predict complex diseases increases with the training sample size. PRSs are generally derived based on summary statistics from large meta-analyses of multiple genome-wide association studies (GWASs). However, it is now common for researchers to have access to large individual-level data as well, such as the UK Biobank data. To the best of our knowledge, it has not yet been explored how best to combine both types of data (summary statistics and individual-level data) to optimize polygenic prediction. The most widely used approach to combine data is the meta-analysis of GWAS summary statistics (meta-GWAS), but we show that it does not always provide the most accurate PRS. Through simulations and using 12 real case-control and quantitative traits from both iPSYCH and UK Biobank along with external GWAS summary statistics, we compare meta-GWAS with two alternative data-combining approaches, stacked clumping and thresholding (SCT) and meta-PRS. We find that, when large individual-level data are available, the linear combination of PRSs (meta-PRS) is both a simple alternative to meta-GWAS and often more accurate.
format Online
Article
Text
id pubmed-8206385
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8206385 2021-06-23 Am J Hum Genet Article Elsevier 2021-06-03 2021-05-07 /pmc/articles/PMC8206385/ /pubmed/33964208 http://dx.doi.org/10.1016/j.ajhg.2021.04.014 Text en © 2021 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8206385/
https://www.ncbi.nlm.nih.gov/pubmed/33964208
http://dx.doi.org/10.1016/j.ajhg.2021.04.014
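
Illustrative note on the description field: the abstract contrasts meta-GWAS (combining per-variant summary statistics before building one PRS) with meta-PRS (a linear combination of separately built PRSs). The sketch below shows the difference in minimal form. It is not the authors' code. The fixed-effect inverse-variance weighting used for meta-GWAS and the least-squares fit used for the meta-PRS weights are assumptions chosen for illustration, and all function names and toy values are hypothetical.

```python
import numpy as np


def meta_gwas(beta_a, se_a, beta_b, se_b):
    """Combine per-variant effects from two GWASs.

    Assumption: fixed-effect inverse-variance weighting, a common
    meta-analysis scheme (the record does not state which one was used).
    """
    w_a = 1.0 / se_a**2
    w_b = 1.0 / se_b**2
    beta = (w_a * beta_a + w_b * beta_b) / (w_a + w_b)
    se = np.sqrt(1.0 / (w_a + w_b))
    return beta, se


def meta_prs_weights(prs_ext, prs_int, y):
    """Fit intercept and weights for a linear combination of two PRSs.

    Assumption: ordinary least squares on a held-out set of individuals
    with phenotype y; for case-control traits a logistic fit may be preferred.
    """
    X = np.column_stack([np.ones_like(prs_ext), prs_ext, prs_int])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, weight_external, weight_internal]


def meta_prs(prs_ext, prs_int, coef):
    """Apply the fitted linear combination to new individuals."""
    return coef[0] + coef[1] * prs_ext + coef[2] * prs_int


# Toy usage with made-up numbers (hypothetical, not from the study).
rng = np.random.default_rng(0)
prs_external = rng.normal(size=200)   # PRS from external summary statistics
prs_internal = rng.normal(size=200)   # PRS from individual-level data
phenotype = 0.3 * prs_external + 0.6 * prs_internal + rng.normal(size=200)

weights = meta_prs_weights(prs_external, prs_internal, phenotype)
combined = meta_prs(prs_external, prs_internal, weights)
```

In this sketch, meta-GWAS merges the evidence at the level of individual variants, while meta-PRS keeps the two scores separate and only learns how much to trust each one, which is why it needs individual-level data with phenotypes to fit the weights.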